How I made $500k with machine learning and HFT (high frequency trading)
This post will detail what I did to make approximately $500k from high frequency trading from 2009 to 2010. Since I was trading entirely independently and am no longer running my program, I'm glad to tell all. My trading was mostly in Russell 2000 and DAX futures contracts.
The key to my success, I believe, was not in a sophisticated financial equation but rather in the overall algorithm design, which tied together many simple components and used machine learning to optimize for maximum profitability. You won't need to know any sophisticated terminology here because when I set up my program it was all based on intuition. (Andrew Ng's amazing machine learning course was not yet available. Btw, if you click that link you'll be taken to my current project: CourseTalk, a review site for MOOCs.)
First, I just want to demonstrate that my success was not simply the result of luck. My program made 1000-4000 trades per day (half long, half short) and never got into positions of more than a few contracts at a time. This meant the random luck from any one particular trade averaged out pretty fast. The result was I never lost more than $2000 in one day and never had a losing month:
(EDIT: These figures are after paying commissions)
And here's a chart to give you a sense of the daily variation. Note this excludes the last 7 months because, as the figures stopped going up, I lost my motivation to enter them.
Prior to setting up my automated trading program I'd had 2 years of experience as a "manual" day trader. This was back in 2001. It was the early days of electronic trading and there were opportunities for "scalpers" to make good money. I can only describe what I was doing as akin to playing a video game / gambling with a supposed edge. Being successful meant being fast, being disciplined, and having good intuitive pattern recognition abilities. I was able to make around $250k, pay off my student loans and have money left over. Win!
Over the next five years I would launch two startups, picking up some programming skills along the way. It wouldn't be until late 2008 that I would get back into trading. With money running low from the sale of my first startup, trading offered hopes of some quick cash while I figured out my next move.
In 2008 I was "manually" day trading futures using software called T4. I'd been wanting some customized order entry hotkeys, so after discovering T4 had an API, I took on the challenge of learning C# (the programming language required to use the API) and went ahead and built myself some hotkeys.
After getting my feet wet with the API I soon had bigger aspirations: I wanted to teach the computer to trade for me. The API provided both a stream of market data and an easy way to send orders to the exchange. All I had to do was create the logic in the middle.
Below is a screenshot of a T4 trading window. What was cool is that when I got my program working I was able to watch the computer trade on this exact same interface. Watching real orders popping in and out (by themselves, with my real money) was both thrilling and scary.
The design of my algorithm
From the outset my goal was to set up a system such that I could be reasonably confident I'd make money before ever making any live trades. To accomplish this I needed to build a trading simulation framework that would, as accurately as possible, simulate live trading.
While trading in live mode required processing market updates streamed through the API, simulation mode required reading market updates from a data file. To collect this data I set up the first version of my program to simply connect to the API and record market updates with timestamps. I ended up using 4 weeks worth of recent market data to train and test my system on.
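The record-then-replay idea can be sketched roughly as follows. This is an illustration in Python, not the C# the program was written in, and the `MarketUpdate` fields, the CSV layout and the function names are all assumptions rather than anything from the T4 API.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, Iterator, TextIO


@dataclass(frozen=True)
class MarketUpdate:
    # Hypothetical field layout; a real feed carries far more detail.
    timestamp: float  # seconds
    bid: float        # inside bid price
    offer: float      # inside offer price


def record(updates: Iterable[MarketUpdate], out: TextIO) -> None:
    """Write each timestamped update as one CSV row for later replay."""
    writer = csv.writer(out)
    for u in updates:
        writer.writerow([u.timestamp, u.bid, u.offer])


def replay(src: TextIO) -> Iterator[MarketUpdate]:
    """Yield recorded updates in file order, standing in for the live stream."""
    for ts, bid, offer in csv.reader(src):
        yield MarketUpdate(float(ts), float(bid), float(offer))


def run_strategy(feed: Iterable[MarketUpdate]) -> int:
    """Placeholder strategy loop that only counts updates.

    The point is that the same loop can consume a live feed or a replayed one.
    """
    return sum(1 for _ in feed)
```

Because `run_strategy` only needs an iterable of updates, switching between live and simulation mode comes down to which feed is passed in.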
With a basic framework in place I still had the task of figuring out how to make a profitable trading system. As it turns out my algorithm would break down into two distinct components, which I'll explore in turn:
- Predicting price movements, and
- Making profitable trades
Predicting price movements
Perhaps an obvious component of any trading system is being able to predict where prices will move. And mine was no exception. I defined the current price as the average of the inside bid and inside offer and I set the goal of predicting where the price would be in the next 10 seconds. My algorithm would need to come up with this prediction moment-by-moment across the trading day.
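To make the prediction target concrete, here is a minimal sketch of computing the mid price and the 10-second-ahead price change from recorded quotes. The tuple layout and the convention of using the last quote at or before the target time are assumptions for illustration.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence, Tuple

HORIZON_SECONDS = 10.0  # prediction horizon stated in the post


def mid_price(bid: float, offer: float) -> float:
    """Current price: the average of the inside bid and inside offer."""
    return (bid + offer) / 2.0


def future_moves(
    quotes: Sequence[Tuple[float, float, float]],
    horizon: float = HORIZON_SECONDS,
) -> List[Optional[float]]:
    """For each (timestamp, bid, offer) quote, return the change in mid price
    `horizon` seconds later, or None if the recording ends before then.

    The future price is read from the last quote at or before the target
    time (an assumed convention; others are possible).
    """
    times = [q[0] for q in quotes]
    mids = [mid_price(q[1], q[2]) for q in quotes]
    moves: List[Optional[float]] = []
    for t, m in zip(times, mids):
        target = t + horizon
        if target > times[-1]:
            moves.append(None)  # not enough future data to label this quote
            continue
        idx = bisect_right(times, target) - 1  # last quote with time <= target
        moves.append(mids[idx] - m)
    return moves
```

These per-quote moves are what each indicator is later compared against.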
Creating & optimizing indicators
I created a handful of indicators that proved to have a meaningful ability to predict short term price movements. Each indicator produced a number that was either positive or negative. An indicator was useful if more often than not a positive number corresponded with the market going up and a negative number corresponded with the market going down.
My system allowed me to quickly determine how much predictive ability any indicator had, so I was able to experiment with a lot of different indicators to see what worked. Many of the indicators had variables in the formulas that produced them and I was able to find the optimal values for those variables by doing side by side comparisons of results achieved with varying values.
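One simple way to score an indicator along these lines is the fraction of times its sign agrees with the subsequent move, compared side by side across parameter values. The sketch below assumes that metric, and the `momentum` indicator is a made-up stand-in, since the post does not describe the actual indicators.

```python
from typing import Dict, Iterable, List, Sequence


def hit_rate(indicator: Sequence[float], moves: Sequence[float]) -> float:
    """Fraction of samples where the indicator's sign matches the sign of
    the subsequent price move. Samples where either value is zero are skipped."""
    hits = total = 0
    for ind, move in zip(indicator, moves):
        if ind == 0 or move == 0:
            continue
        total += 1
        if (ind > 0) == (move > 0):
            hits += 1
    return hits / total if total else 0.0


def momentum(prices: Sequence[float], lookback: int) -> List[float]:
    """Hypothetical example indicator: price change over the last `lookback` ticks."""
    return [prices[i] - prices[max(0, i - lookback)] for i in range(len(prices))]


def sweep(
    prices: Sequence[float], moves: Sequence[float], lookbacks: Iterable[int]
) -> Dict[int, float]:
    """Side-by-side comparison: hit rate for each candidate lookback value."""
    return {n: hit_rate(momentum(prices, n), moves) for n in lookbacks}
```

Picking the best variable value is then just `max(results, key=results.get)` over the dictionary returned by `sweep`.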
The indicators that were most useful were all relatively simple and were based on recent events in the market I was trading as well as the markets of correlated securities.
Making exact price move predictions
Having indicators that simply predicted an up or down price movement wasn't enough. I needed to know exactly how much price movement was predicted by each possible value of each indicator. I needed a formula that would convert an indicator value to a price prediction.
To accomplish this I tracked predicted price moves in 50 buckets that depended on the range that the indicator value fell in. This produced unique predictions for each bucket that I was then able to graph in Excel. As you can see, the expected price change increases as the indicator value increases.
Based on a graph such as this I was able to make a formula to fit the curve. In the beginning I did this "curve fitting" by hand but I soon wrote up some code to automate this process.
Note that not all the indicator curves had the same shape. Also note the buckets were logarithmically distributed so as to spread the data points out evenly. Finally, note that negative indicator values (and their corresponding downward price predictions) were flipped and combined with the positive values. (My algorithm treated up and down exactly the same.)
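The bucketing step might look something like the sketch below: logarithmically spaced buckets over the indicator's magnitude, with negative values folded onto the positive side by flipping the sign of the move. The function names and the choice to clamp out-of-range values into the end buckets are assumptions.

```python
from typing import List, Sequence


def log_bucket_edges(min_abs: float, max_abs: float, n_buckets: int = 50) -> List[float]:
    """Upper edges of logarithmically spaced buckets covering indicator
    magnitudes from `min_abs` to `max_abs` (both must be positive)."""
    ratio = (max_abs / min_abs) ** (1.0 / n_buckets)
    return [min_abs * ratio ** (i + 1) for i in range(n_buckets)]


def bucket_index(magnitude: float, edges: Sequence[float]) -> int:
    """First bucket whose upper edge is >= magnitude; larger values are
    clamped into the last bucket."""
    for i, edge in enumerate(edges):
        if magnitude <= edge:
            return i
    return len(edges) - 1


def expected_move_by_bucket(
    indicator: Sequence[float], moves: Sequence[float], edges: Sequence[float]
) -> List[float]:
    """Average subsequent move per bucket of indicator magnitude.

    A negative indicator value is flipped along with its move, so up and
    down predictions share one curve.
    """
    sums = [0.0] * len(edges)
    counts = [0] * len(edges)
    for ind, move in zip(indicator, moves):
        if ind == 0:
            continue  # no direction to fold on
        sign = 1.0 if ind > 0 else -1.0
        b = bucket_index(abs(ind), edges)
        sums[b] += sign * move
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

The list returned by `expected_move_by_bucket` is the set of points that would be graphed and then fitted with a curve.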
Combining indicators for a single prediction
An important thing to consider was that each indicator was not entirely independent. I couldn't simply add up all the predictions that each indicator made individually. The key was to figure out the additional predictive value that each indicator had beyond what was already predicted. This wasn't too hard to implement but it did mean that if I was "curve fitting" multiple indicators at the same time I had to be careful: changing one would affect the predictions of another.
In order to "curve fit" all of the indicators at the same time I set up the optimizer to step only 30% of the way towards the new prediction curves with each pass. With this 30% jump I found that the prediction curves would stabilize within a few passes.
With each indicator now giving us its additional price prediction I could simply add them up to produce a single prediction of where the market would be in 10 seconds.
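The procedure described here resembles damped backfitting of an additive model: refit each indicator against what the others leave unexplained, but only move part of the way each pass. The sketch below is one reading of that description, not the author's code. It assumes each indicator's curve is stored as one value per bucket and that indicators are updated one after another within a pass.

```python
from typing import List, Sequence

STEP = 0.3  # step 30% of the way toward the refitted curve each pass (from the post)


def fit_pass(
    buckets: Sequence[Sequence[int]],
    targets: Sequence[float],
    curves: List[List[float]],
    step: float = STEP,
) -> None:
    """Run one optimization pass, updating `curves` in place.

    buckets[k][i] is the bucket that indicator k falls in for sample i.
    curves[k][b] is indicator k's current additional prediction for bucket b.
    targets[i] is the actual price move for sample i.
    """
    for k, curve in enumerate(curves):
        sums = [0.0] * len(curve)
        counts = [0] * len(curve)
        for i, target in enumerate(targets):
            # What all the other indicators already predict for this sample.
            others = sum(
                curves[j][buckets[j][i]] for j in range(len(curves)) if j != k
            )
            b = buckets[k][i]
            sums[b] += target - others  # the part left for indicator k to explain
            counts[b] += 1
        for b in range(len(curve)):
            if counts[b]:
                fresh = sums[b] / counts[b]
                curve[b] += step * (fresh - curve[b])  # damped update


def predict(sample_buckets: Sequence[int], curves: Sequence[Sequence[float]]) -> float:
    """Combined prediction: the sum of every indicator's additional prediction."""
    return sum(curve[b] for curve, b in zip(curves, sample_buckets))
```

Calling `fit_pass` repeatedly until the curves stop changing gives each indicator only the predictive value the others do not already cover, which is what makes the final sum in `predict` valid.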
Why predicting prices is not enough
You might think that with this edge on the market I was golden. But you need to keep in mind that the market is made up of bids and offers. It's not just one market price. Success in high frequency trading comes down to getting good prices and it's not that easy.
The following factors make creating a profitable system difficult:
- With each trade I had to pay commissions to both my broker and the exchange.
- The spread (difference between highest bid and lowest offer) meant that if I were to simply buy and sell randomly I'd be losing a ton of money.
- Most of the market volume was other bots that would only execute a trade with me if they thought they had some statistical edge.
- Seeing an offer did not guarantee that I could buy it. By the time my buy order got to the exchange it was very possible that the offer would have been cancelled.
- As a small market player there was no way I could compete on speed alone.
Building a full trading simulation
So I had a framework that allowed me to backtest and optimize indicators. But I had to go beyond this. I needed a framework that would allow me to backtest and optimize a full trading system, one where I was sending orders and getting in positions. In this case I'd be optimizing for total P&L and to some extent average P&L per trade.
This would be trickier and in some ways impossible to model exactly but I did as best as I could. Here are some of the issues I had to deal with:
- When an order was sent to the market in simulation I had to model the lag time. The fact that my system saw an offer did not mean that it could buy it straight away. The system would send the order, wait approximately 20 milliseconds and then only if the offer was still there was it considered an executed trade. This was inexact because the real lag time was inconsistent and unreported.
- When I placed bids or offers I had to look at the trade execution stream (provided by the API) and use those trades to gauge when my order would have been executed against. To do this right I had to track the position of my order in the queue. (It's a first-in first-out system.) Again, I couldn't do this perfectly but I made a best approximation.
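Both issues can be approximated in a few lines. The sketch below is a simplified illustration under stated assumptions: `offer_at` is a hypothetical lookup into recorded data, the lag is a fixed 20 ms, and the queue model ignores cancellations by orders ahead of ours, which a fuller model would need to estimate.

```python
from dataclasses import dataclass
from typing import Callable

LAG_SECONDS = 0.020  # approximate order lag from the post


def try_take_offer(
    send_time: float,
    price: float,
    offer_at: Callable[[float], float],
    lag: float = LAG_SECONDS,
) -> bool:
    """A buy at `price` counts as executed only if the best offer is still
    at or below `price` once the order 'arrives' after the lag.

    `offer_at(t)` is a hypothetical function returning the recorded best
    offer at time t.
    """
    return offer_at(send_time + lag) <= price


@dataclass
class RestingOrder:
    """A simulated bid or offer waiting in a first-in first-out queue."""

    price: float
    size: int
    ahead: int       # contracts queued in front of us when we joined
    filled: int = 0

    def on_trade(self, trade_price: float, trade_size: int) -> int:
        """Process one trade from the execution stream.

        Returns the number of our contracts filled by this trade.
        """
        if trade_price != self.price or self.filled >= self.size:
            return 0
        consumed = min(self.ahead, trade_size)  # the queue ahead trades first
        self.ahead -= consumed
        fill = min(trade_size - consumed, self.size - self.filled)
        self.filled += fill
        return fill
```

Feeding every recorded trade through `on_trade` gives an estimate of when a resting order would have been filled in the real market.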
To refine my order execution simulation, I took my log files from live trading through the API and compared them to log files produced by simulated trading from the exact same time period. I was able to get my simulation to the point that it was pretty accurate, and for the parts that were impossible to model exactly I made sure to at least produce outcomes that were statistically similar (in the metrics I thought were important).
Making profitable trades
With an order simulation model in place I could now send orders in simulation mode and see a simulated P&L. But how would my system know when and where to buy and sell?
The price move predictions were a starting point but not the whole story. What I did was create a scoring system for each of 5 price levels on the bid and offer. These included one level above the inside bid (for a buy order) and one level below the inside offer (for a sell order).
If the score at any given price level was above a certain threshold, that would mean my system should have an active bid/offer there. Below the threshold, any active orders should be cancelled. Based on this it was not uncommon that my system would flash a bid in the market then immediately cancel it. (Although I tried to minimize this as it's annoying as heck to anyone looking at the screen with human eyes, including me.)
The price level scores were calculated based on the following factors:
- The price move prediction (that we discussed earlier).
- The price level in question. (Inner levels meant greater price move predictions were required.)
- The number of contracts in front of my order in the queue. (Less was better.)
- The number of contracts behind my order in the queue. (More was better.)
Essentially these factors served to identify "safe" places to bid/offer. The price move prediction alone was not adequate because it did not account for the fact that when placing a bid I was not automatically filled. I only got filled if someone sold to me there. The reality was that the mere fact of someone selling to me at a certain price changed the statistical odds of the trade.
The variables used in this step were all subject to optimization. This was done in the exact same way as I optimized variables in the price move indicators, except in this case I was optimizing for bottom line P&L.
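The post does not give the scoring formula, so the sketch below simply combines the four listed factors linearly to show the shape of the idea. Every weight, the threshold and the `aggressiveness` encoding of the price level are hypothetical placeholders of the kind that would be tuned against simulated P&L.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreParams:
    # All values are hypothetical placeholders, not figures from the post.
    level_penalty: float = 0.5    # extra predicted move required per step inward
    ahead_weight: float = 0.01    # penalty per contract queued ahead of us
    behind_weight: float = 0.005  # bonus per contract queued behind us
    threshold: float = 1.0        # minimum score to keep an order working


def bid_score(
    predicted_move: float,
    aggressiveness: int,
    ahead: int,
    behind: int,
    p: ScoreParams,
) -> float:
    """Score for resting a bid at one price level.

    predicted_move: predicted price change in ticks (positive means up).
    aggressiveness: 0 for the outermost level considered, increasing for
        levels closer to (or inside) the spread, which need a bigger prediction.
    """
    return (
        predicted_move
        - p.level_penalty * aggressiveness
        - p.ahead_weight * ahead
        + p.behind_weight * behind
    )


def should_have_order(score: float, p: ScoreParams) -> bool:
    """Keep an active order at this level only while its score beats the threshold."""
    return score > p.threshold
```

Since the algorithm treated up and down the same, an offer could be scored with the same function by passing the negated predicted move.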
When trading as humans we often have powerful emotions and biases that can lead to less than optimal decisions. Clearly I did not want to codify these biases. Here are some factors my system ignored:
- The price at which a position was entered: In a trading office it's pretty common to hear conversation about the price at which someone is long or short, as if that should affect their future decision making. While this has some validity as part of a risk reduction strategy, it really has no bearing on the future course of events in the market. Therefore my program completely ignored this information. It's the same concept as ignoring sunk costs.
- Going short vs. exiting a long position: Typically a trader would have different criteria for where to sell a long position versus where to go short. However, from my algorithm's perspective there was no reason to make a distinction. If my algorithm expected a downward move, selling was a good idea regardless of whether it was currently long, short, or flat.
- A "doubling up" strategy: This is a common strategy where traders will buy more stock in the event that their original trade goes against them. This results in your average purchase price being lower and it means when (or if) the stock turns around you'll be set to make your money back in no time. In my opinion this is really a horrible strategy unless you're Warren Buffett. You're tricked into thinking you are doing well because most of your trades will be winners. The problem is when you lose you lose big. The other effect is that it makes it hard to judge if you actually have an edge on the market or are just getting lucky. Being able to monitor and confirm that my program did in fact have an edge was an important goal.
Since my algorithm made decisions the same way regardless of where it entered a trade or whether it was currently long or short, it did occasionally sit in (and take) some large losing trades (in addition to some large winning trades). But you shouldn't think there wasn't any risk management.
To manage risk I enforced a maximum position size of 2 contracts at a time, occasionally bumped up on high volume days. I also had a maximum daily loss limit to safeguard against any unexpected market conditions or a bug in my software. These thresholds were enforced in my code but also in the backend through my broker. As it happened I never encountered any significant problems.
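A pre-trade check along these lines could be as small as the sketch below. The 2-contract cap comes from the post, but the loss figure is a hypothetical placeholder (the post does not state the configured limit), and allowing only position-reducing orders after the limit is hit is an assumed policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskLimits:
    max_position: int = 2           # contracts, per the post
    max_daily_loss: float = 2000.0  # hypothetical; the actual limit is not stated


def order_allowed(
    position: int, order_qty: int, daily_pnl: float, limits: RiskLimits
) -> bool:
    """Decide whether an order may be sent.

    position and order_qty are signed: positive is long/buy, negative is
    short/sell. Once the daily loss limit is hit, only orders that shrink
    the position are allowed (an assumed policy).
    """
    new_position = position + order_qty
    if daily_pnl <= -limits.max_daily_loss:
        return abs(new_position) < abs(position)
    return abs(new_position) <= limits.max_position
```

A check like this belongs in front of every order the system sends, with the broker-side limits acting as a second line of defense if the code itself is the thing that fails.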
From the moment I started working on my program it took me about 6 months before I got it to the point of profitability and began running it live. Although, to be fair, a significant amount of that time was spent learning a new programming language. As I worked to improve the program I saw increased profits for each of the next four months.
Each week I would retrain my system based on the previous 4 weeks worth of data. I found this struck the right balance between capturing recent market behavioral trends and ensuring my algorithm had enough data to establish meaningful patterns. As the training began taking more and more time I split it out so that it could be performed by 8 virtual machines using Amazon EC2. The results were then coalesced on my local machine.
The high point of my trading was October 2009 when I made almost $100k. After this I continued to spend the next four months trying to improve my program despite decreased profit each month. Unfortunately by this point I guess I'd implemented all my best ideas because nothing I tried seemed to help much.
With the frustration of not being able to make improvements and not having a sense of growth, I began thinking about a new direction. I emailed 6 different high frequency trading firms to see if they'd be interested in purchasing my software and hiring me to work for them. Nobody replied. I had some new startup ideas I wanted to work on so I never followed up.
UPDATE: I posted this on Hacker News and it has gotten a lot of attention. I just want to say that I do not advocate anyone trying to do something like this themselves now. You would need a team of really smart people with a range of experiences to have any hope of competing. Even when I was doing this I believe it was very rare for individuals to achieve success (though I had heard of others).
There is a comment at the top of the page that mentions "manipulated statistics" and refers to me as a "retail investor" that quants would "gleefully pick off". This is a rather unfortunate comment that's simply not based in reality. Setting that aside, there are some interesting comments: http://news.ycombinator.com/item?id=4748624
UPDATE #2: I've posted a follow-up FAQ that answers some common questions I've received from traders about this post.