SAP HANA and Algorithmic Trading: Every Millisecond Counts
It is said that IT is conquering Wall Street by means of robots using algorithmic trading. Some claim that algorithmic trading systems have unfair advantages through high speed and low latency. High-frequency trading (HFT) is probably the best-known type of algorithmic trading, but the term generally covers any programmed trading. Top quant trading firms, such as Renaissance Technologies, store terabytes of data every day. The messages from equity (stock) exchanges alone represent several terabytes per day. Big data and performance make me think of in-memory data management. In this blog, I will provide my view on how capital market trading systems can benefit from using SAP HANA.
The Black Box
Let’s start by having a look at the typical components of a trading system, also referred to as the Black Box. The model is taken from the excellent book “Inside the Black Box” by Rishi Narang. In the following, each part is described briefly, with the aim of making it understandable for IT geeks with no finance background. For more details, read the book.
Figure 1, 2013, Rishi Narang, Inside The Black Box 2nd Edition, Wiley, 171 p.
The alpha model uses a strategy to determine which securities to buy. There are basically two types of alpha model: fundamental analysis (http://en.wikipedia.org/wiki/Fundamental_analysis) and technical analysis (http://en.wikipedia.org/wiki/Technical_analysis). Extremely simple example strategies could be to buy stocks with a positive price-earnings (P/E) ratio below 10 (~fundamental analysis) or stocks trading below their 200-day moving average price (~technical analysis).
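The two toy rules above can be sketched in a few lines of Python. This is purely illustrative: the function name, data layout, and thresholds are my own invention, not part of any real alpha model.

```python
# Hypothetical alpha model sketch combining the two toy rules from the
# text: positive P/E below 10, or price below the 200-day moving average.

def alpha_signals(stocks, pe_limit=10, ma_window=200):
    """Return tickers flagged by either toy rule.

    stocks maps ticker -> dict with 'pe' (trailing P/E), 'price'
    (current price), and 'history' (closing prices, oldest first).
    """
    buys = []
    for ticker, d in stocks.items():
        # Fundamental rule: P/E is positive and below the limit.
        fundamental = 0 < d["pe"] < pe_limit
        # Technical rule: price below the moving average of the last
        # ma_window closes (use all available history if shorter).
        window = d["history"][-ma_window:]
        moving_avg = sum(window) / len(window)
        technical = d["price"] < moving_avg
        if fundamental or technical:
            buys.append(ticker)
    return buys
```

A real alpha model would of course weigh and combine many such signals instead of applying a simple OR.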
The risk model is about selecting and sizing the risk exposures of an alpha model. Assume that the above-mentioned alpha model only finds one stock to buy; would it be right to put all eggs in one basket? No, probably not. Or let's assume that the vast majority of stocks found are from the same industry. Would it be wise to invest everything in one industry? Think about the IT industry at the beginning of the millennium and you will have the answer to that question. Risk models can, among many other things, limit the size of a stock position relative to the overall portfolio and the exposure to a single industry.
The transaction cost model consists of three main areas: commissions and fees, slippage, and market impact. Commissions and fees should be rather self-explanatory (although there are some complicated details in this area too), but slippage and market impact might need some more explanation. Slippage is the change in the price of a stock from the moment a trade is placed to when it is actually executed; it can be negative or positive. Market impact refers to the impact a trade itself can have on the price of a stock. Anyone who has traded illiquid stocks, often small caps, knows that even minor orders can move the price significantly.
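The commission and slippage components can be made concrete with a small sketch. The function and the per-share commission rate are illustrative assumptions; market impact is left out because it requires a model of the order book.

```python
# Hypothetical cost sketch: cost of a fill = commission plus slippage
# versus the price at the moment the trade decision was made.

def trade_cost(qty, decision_price, fill_price, commission_per_share=0.005):
    """For a buy of qty shares, positive slippage means we paid more
    than planned; negative slippage is in our favor."""
    slippage = (fill_price - decision_price) * qty
    commission = commission_per_share * abs(qty)
    return {"slippage": slippage,
            "commission": commission,
            "total": slippage + commission}
```

For example, buying 1,000 shares decided at 50.00 but filled at 50.02 costs 20 in slippage on top of the commission.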
The portfolio construction model combines the alpha, risk, and transaction cost models to determine a target portfolio. The gap between the existing and target portfolio implies what and how much to trade. Needless to say, the portfolio construction model is the most important and central part of the black box. Without a proper strategy, the other parts of the black box are meaningless and add no business value.
Execution: And … action! Execution is when the planned trades get carried out on the marketplaces. It is about order fulfillment at the best price. On first-come, first-served marketplaces, it is apparent that speed is a key component of successful trading. However, speed is not the only important factor; other factors, such as order size, often come into play. That is why smart algorithms are needed to make the most of the varying marketplace rules.
Data is the input to the black box. Without proper input there will be no valuable output. The input can be either price-related information or fundamental data; examples of fundamental data are earnings reports or unemployment rates. Above all else, the data must be correct and come from trustworthy sources (if you have seen the movie Trading Places, starring Dan Aykroyd and Eddie Murphy, you know what can happen when data is not accurate and reliable).
After having shed some light on the components of a trading system, it is time to briefly look at how a trade is made from an IT perspective. I call it the trading round trip.
Trading Round Trip
No matter whether you are a human being or an automatic trading system, the first thing you do is collect information. Assuming that the data is reliable, as mentioned in the previous section, it is important to get the information as quickly as possible. Knowing something before anyone else is a key competitive advantage. A speedy infrastructure to the data sources must therefore be obtained. But speed is not enough; the infrastructure must also be robust so that no data is missing. This is especially important for trading systems that need to keep track of exchange order books.
Information can come from varying sources and needs to be transformed into formats that are comparable and processable. As the amounts of data per day run to numerous terabytes, smart data management can lead to better performance and lower hardware costs. For example, storing information in a columnar fashion enables data compression and speeds up queries.
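Why columnar storage compresses so well can be illustrated with run-length encoding, one of the compression schemes that column stores commonly apply. The tick data below is invented for the illustration.

```python
# Illustrative sketch: a column of repetitive values (e.g. the "symbol"
# column of a tick feed) run-length encodes to a tiny fraction of its
# original size, which is much harder to achieve in row storage where
# heterogeneous fields are interleaved.

def rle_encode(column):
    """Run-length encode a list into [(value, count), ...]."""
    runs = []
    for v in column:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs

# A symbol column from a (made-up) tick feed:
ticks = ["SAP"] * 5000 + ["IBM"] * 3000 + ["SAP"] * 2000
encoded = rle_encode(ticks)
# 10,000 values collapse into 3 runs.
```

Engines like SAP HANA use dictionary encoding and similar schemes rather than plain RLE, but the principle — exploiting the homogeneity within a column — is the same.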
After handling the data, it is time to process the information by using the portfolio construction model to find out what and how much to trade. Strategy implementation and performance are the two main aspects of the processing. Optimizing the calculation speed can be achieved in many ways. A good starting point is to ensure that the code runs on a speedy platform with support for parallelization. But remember, bad code kills good platforms. It is therefore wise to check the code for performance bottlenecks. This is not always just a source code exercise; in most cases it is about balancing the requirements against the performance impact. Another non-runtime aspect is proper code management, which is crucial for avoiding bugs during application management and further development. On August 1, 2012, Knight Capital Americas LLC almost went bust due to a bug in its trading system.
Execution requires both sophisticated trading algorithms and complex infrastructures, which must be very robust and swift, to enable automatic trading. Although the trading algorithms are sophisticated, they generally operate on relatively little data and require less processing than the above-mentioned strategy calculations.
Throughout the trading round trip every millisecond counts. That said, the timing perspective is somewhat different for HFT and long-term algorithmic trading.
HFT and Long-Term Algorithmic Trading
HFT is about arbitrage and market making. Arbitrage opportunities occur when there are irregularities between marketplaces. For example, let's assume that a stock, like SAP, is listed on both the New York and Frankfurt stock exchanges. All else being equal, the stock prices should match, as it is the same company you buy in both marketplaces. Imagine somebody places a very large buy order in New York that makes the stock soar. This could lead to a very short period where the stock price differs between New York and Frankfurt. An arbitrageur would then quickly buy the stock in Frankfurt, knowing that the price is higher in New York.
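The cross-listing check behind this example is conceptually tiny. The sketch below uses invented prices and ignores everything that makes real arbitrage hard: fees, FX execution, latency, and available size.

```python
# Toy arbitrage check across two listings of the same stock.
# All numbers and the min_edge threshold are illustrative only.

def arbitrage_signal(price_ny_usd, price_fra_eur, eurusd, min_edge=0.01):
    """Compare the USD-converted Frankfurt price with the New York
    price; return a (direction, edge) tuple if the gap exceeds
    min_edge, else None."""
    fra_in_usd = price_fra_eur * eurusd
    edge = price_ny_usd - fra_in_usd
    if edge > min_edge:
        return ("buy_fra_sell_ny", edge)   # Frankfurt is cheaper
    if -edge > min_edge:
        return ("buy_ny_sell_fra", -edge)  # New York is cheaper
    return None
```

In practice the window for such a signal is milliseconds, which is exactly why HFT firms invest so heavily in low-latency infrastructure.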
Being a market maker is a bit like running a used car dealership. The main business idea is to buy cheap and sell expensive; market makers use the spread to achieve this. People often sell their cars to dealerships at a discount to get money quickly. In much the same way, an HFT market maker provides liquidity by taking the other side of an order. A difference between the car dealership and HFT, however, is that there must be no inventory left at the end of the day. The deal is often made in milliseconds to avoid taking any additional risk. An HFT trading round trip is extremely iterative in the sense that each trade has a direct impact on the next. Speed is probably the most important competitive advantage in HFT, and therefore many compromises must be made to optimize for it. For example, the alpha model should be simple and use limited input data, and the risk and transaction cost models might be ignored at run time.
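The spread-capture idea can be shown in a few lines. The quote width and sizes are made up, and the sketch deliberately ignores inventory risk and adverse selection, which real market makers must manage constantly.

```python
# Toy market-making sketch: quote a bid and an ask around a mid price
# and earn the spread when both sides trade.

def quotes(mid, half_spread=0.01):
    """Return (bid, ask) around a mid price."""
    return round(mid - half_spread, 4), round(mid + half_spread, 4)

bid, ask = quotes(50.00)
# If one buyer lifts our ask and one seller hits our bid for 100
# shares each, inventory ends flat and the spread is the profit:
pnl = (ask - bid) * 100
```

The round trip here is the whole point: buy at the bid, sell at the ask, end the day with no stock position, repeat thousands of times.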
Long-term algorithmic trading is often based on fundamental analysis and trades less frequently than HFT, triggered by events such as earnings reports. However, once the fundamental data is updated, it is important to act quickly upon the changes. The trades are usually larger than in HFT and therefore require swift and smart execution. Unlike HFT, few compromises, if any, need to be made regarding the portfolio construction model. This is therefore a typical example of where big data meets high-speed requirements.
Runtime and Research
So far I have focused mainly on the runtime perspective of trading. There is another area that deserves mentioning here – research. Research is the brain of black box trading. This is where quants spend most of their time coming up with and back-testing their trading ideas. Although some trading systems do use machine learning to make trading smarter, it is still human beings who define the strategies, i.e., how the black boxes are programmed. Research requires reliable, structured, flexible, and massive amounts of data. The tools should be intuitive, flexible, and have acceptable response times.
You made it this far, a little SAP HANA is right around the corner!
In short, SAP HANA (SAP HANA Developer Guide, Version 1.2, 03-09-2013):
- Is an in-memory database that runs on multi-core CPUs with fast communication between processor cores and terabytes of main memory
- Is optimized for columnar data storage
- Allows highly efficient compression
- Eliminates, in many cases, the need for additional index structures; storing data in columns is functionally similar to having a built-in index for each column
- Is designed to perform its basic calculations, such as analytic joins, scans, and aggregations, in parallel, often using hundreds of cores at the same time and fully utilizing the available computing resources of distributed systems
- Simplifies applications by eliminating materialized aggregates; with a scanning speed of several gigabytes per millisecond, SAP HANA makes it possible to calculate aggregates on large amounts of data on the fly with high performance
- Can host the controller of the Model-View-Controller (MVC) pattern, thereby greatly decreasing the overhead of moving data between the application server and database
- Provides comprehensive development tools
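The "no materialized aggregates" point deserves a small illustration. In SAP HANA this would simply be a SQL GROUP BY executed on the fly against the column store; the Python below mimics the idea on invented trade data rather than maintaining a pre-computed summary table.

```python
# Sketch of on-the-fly aggregation: instead of keeping a materialized
# "daily volume" table in sync with every insert, compute the aggregate
# from the raw rows whenever it is needed. Data is invented.

from collections import defaultdict

trades = [
    ("SAP", "2013-09-03", 200),
    ("SAP", "2013-09-03", 300),
    ("IBM", "2013-09-03", 150),
    ("SAP", "2013-09-04", 100),
]

def volume_by_symbol_day(rows):
    """Aggregate traded volume per (symbol, day) on the fly."""
    agg = defaultdict(int)
    for symbol, day, qty in rows:
        agg[(symbol, day)] += qty
    return dict(agg)
```

With a scan speed of gigabytes per millisecond, recomputing such aggregates on demand removes a whole class of consistency problems that materialized summary tables bring with them.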
The mapping of the SAP HANA features to the trading round trip is depicted in Figure 3.
Features 1 and 2 fit perfectly with the data storage part of data management. For conversion, it is an advantage that there is a broad and accessible knowledge base about migrating huge amounts of data to SAP HANA. The compression implies lower hardware costs. Features 3, 4, 5, and 6 are all about processing, from both a design-time and a runtime perspective. As the execution part operates on relatively small amounts of data, I doubt that SAP HANA can provide any additional benefit compared to highly specialized trading code. SAP HANA does not come with any specific marketplace infrastructure advantages that a trading system can benefit from, which also rules out the collecting part.
Will we see SAP HANA as an execution platform? Most probably not. Is SAP HANA going to be a part of HFT trading systems? Not likely, but maybe for the research part of it. Can mid- and low-frequency trading systems benefit from SAP HANA? Yes, definitely, both at runtime and for research, as it is a superior standard platform for processing and data management. Big data will make it possible to analyze unstructured information in ways formerly impossible. This could introduce new ratios (for example, the ratio of positive to negative posts in social media regarding a small-cap stock), or patterns found via data mining, that might lead to entirely new, as yet unknown, successful strategies.
Many analyst tools rely heavily on collecting, management, and processing of data. For these tools, SAP HANA should be a platform to consider.
On SCN I usually focus on ABAP development. My followers on Twitter, however, will know that I am a leisure quant looking for property in Graham and Doddsville. I think my DNA contains a value gene. Reading Thorsten Franz's blog Ingredients for Magic inspired me to write this blog, which combines two of my main ingredients: SAP and securities.