Algorithmic Trading: Using Data Science in Finance
Algorithmic trading, often referred to as “algo” trading by those in the industry, has become a hot topic for retail traders and small investment firms. In the 1970s, large financial institutions pioneered computer-based trading to handle the buying and selling of financial securities. Banks and insurance companies dominated markets for centuries; in more recent times, hedge funds have claimed a significant place in the financial markets. Then, the digital revolution removed barriers to entry into the market. High-speed internet, computing power, and data science tools are now available and affordable for the general public. With the emergence of online trading platforms and apps, trading in financial products has never been easier. Today, it requires only a few mouse clicks to trade stocks, futures, and currencies.
In this article, I’d like to give you an overview of algorithmic trading and provide a practical guide on how to start your algorithmic trading business.
Last Updated May 2022
What is Algorithmic Trading?
Today, more than 75% of US stock trades are placed by computer algorithms, not humans. This figure has been growing over time and will continue to do so. There is no single definition of “algorithmic trading.” Depending on their background, different people mean different things. At its most basic, an algorithm is a “sequence of steps to achieve a goal.” The CFA Institute defines algorithmic trading as “using a computer to automate a trading strategy.” Computer programmers have created many different algorithmic trading strategies used by traders every day. Irrespective of the specific strategy, algorithmic trading has two major aspects:
- Algorithms are initiated by humans and follow a clear strategy to reach specific goals. Initially, algorithms were sets of pre-programmed rules. These rules, developed by programmers, are based on mathematical and statistical models. The emergence of artificial intelligence and machine learning introduced data-driven and self-learning algorithms. Consequently, the job profile for algorithmic traders has changed. Data science and data engineering skills have become much more relevant.
- Trades are automated. Computers place and execute orders, not humans. Unlike the human brain, computers can process large amounts of data with ease. They can make thousands of trading decisions within microseconds.
Major Applications and Use Cases
The three major use cases of algorithmic trading are:
- Execution algorithms
- Portfolio rebalancing algorithms
- High-frequency trading algorithms
Execution Algorithms
For years, algorithmic trading was synonymous with execution-only algorithms (broker algorithms). Large institutions use execution algorithms to break large orders down into smaller ones. These smaller orders are then executed over time. The goal is to reduce the impact that a large order has on the market. This way, traders can achieve a price close to a chosen benchmark at low trading costs. Examples of execution algorithms are “volume-weighted average price (VWAP)” and “implementation shortfall” algorithms. Execution algorithms are standard tools for brokers and large institutions. They play a minor part for retail traders.
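To make the idea more concrete, here is a minimal Python sketch of the two ingredients mentioned above: computing a VWAP benchmark from a list of trades and splitting a large parent order into smaller child orders. The function names, the equal-size slicing rule, and the sample numbers are illustrative assumptions, not an actual broker algorithm.

```python
def vwap(trades):
    """Volume-weighted average price of (price, volume) pairs."""
    total_volume = sum(volume for _, volume in trades)
    if total_volume == 0:
        raise ValueError("total volume must be positive")
    return sum(price * volume for price, volume in trades) / total_volume


def slice_order(total_quantity, n_slices):
    """Split a parent order into n_slices child orders of near-equal size."""
    base, remainder = divmod(total_quantity, n_slices)
    # The first `remainder` child orders carry one extra unit each.
    return [base + 1 if i < remainder else base for i in range(n_slices)]


# Hypothetical market trades as (price, volume) pairs
market_trades = [(100.0, 200), (101.0, 300), (102.0, 500)]
benchmark = vwap(market_trades)

# Hypothetical parent order of 10,000 shares, worked over 6 time slots
child_orders = slice_order(10_000, 6)
```

Real execution algorithms usually size the child orders according to expected market volume rather than equally; the equal split is only the simplest possible variant.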
Portfolio Rebalancing Algorithms and Robo Investing
Institutional investors have target weights for assets and asset classes. As time goes by and markets move, the weights of portfolio constituents drift away from their targets. That’s why portfolio rebalancing is a critical workflow. In simple words, rebalancing algorithms sell “winners” and buy “losers” to restore the target weights. Performance goals and regulatory constraints are the driving factors for target weights. Insurance companies and pension plans are regulated investors. They need to comply with strict limits. One example could be having no more than 40% stock investments at any time. Automated monitoring and automated trading systems play a pivotal role in achieving this.
Robo investing is a recent trend in the market. It democratizes professional investing and professional investment advising. Robo advisors offer a variety of services to retail investors, such as:
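The rebalancing arithmetic can be sketched in a few lines of Python. The portfolio, prices, and the 40/60 target below are made-up example values, and the function name is hypothetical; trading costs and minimum trade sizes are ignored.

```python
def rebalancing_trades(holdings, prices, target_weights):
    """Return the value to buy (+) or sell (-) per asset to restore the targets."""
    values = {asset: holdings[asset] * prices[asset] for asset in holdings}
    total = sum(values.values())
    return {
        asset: target_weights[asset] * total - values[asset]
        for asset in holdings
    }


# Hypothetical portfolio: stocks have rallied and now exceed their 40% target
holdings = {"stocks": 50, "bonds": 50}          # units held
prices = {"stocks": 120.0, "bonds": 100.0}      # current prices
target_weights = {"stocks": 0.40, "bonds": 0.60}

trades = rebalancing_trades(holdings, prices, target_weights)
for asset, amount in trades.items():
    action = "buy" if amount > 0 else "sell"
    print(f"{action} {abs(amount):.2f} worth of {asset}")
```

In this example the stock position has grown to more than half of the portfolio, so the sketch sells part of the “winner” (stocks) and buys the “loser” (bonds) for the same amount.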
- Determining an appropriate risk-return profile
- Asset allocation
- Portfolio optimization
- Portfolio management and monitoring
- Portfolio rebalancing
Robo advisors provide these services with minimal human intervention.
High-Frequency Trading Algorithms (Market Timing)
High-frequency trading (HFT) algorithms are about profit. They are also called “alpha-generating strategies.” The term “high-frequency” refers to:
- Tracking of high-frequency streams of data (such as market data feeds or news feeds)
- Identifying patterns and trading opportunities in the data
- Making trading decisions based on those patterns
- Automatically placing and executing orders to capitalize on those opportunities
HFT is about “what to trade” and “when to trade” (market timing). Most HFT strategies work independently from general market trends. Traders take long and short positions to benefit from temporary mispricings in the markets. Typically, they buy and sell multiple times within the same trading day – often referred to as “day trading”. For years, retail day traders relied on intuition and gut feeling. The recent trend is towards HFT algorithms (“algorithmic day trading”). HFT strategies can stem from:
- Fundamental Data
Fundamental data includes company revenues, earnings, profits, margins, and other news. It allows traders to determine a stock’s underlying value. Then they can decide whether a stock is overvalued or undervalued. Likewise, they can use fundamental economic data to identify mispriced currencies and indices. Macroeconomic data include interest rates, inflation, and unemployment rates.
- Technical Analysis/Technical Indicators
Technical traders try to find patterns and trends in historical price and volume data. They can use those patterns and trends to predict future prices and returns. The simple moving average (SMA) is an example of a technical indicator. There are also more complex charting techniques like Elliott Wave Patterns. For more information on Technical Analysis with Python, check out this course.
- Machine Learning/Artificial Intelligence
For many years, trading strategies were rules-based and pre-programmed. Traders translated observed patterns into simple if-then rules. The emergence of artificial intelligence (AI) and machine learning (ML) was a game-changer. Both allow traders to detect more complex patterns and hidden relationships in the market. ML algorithms are self-learning algorithms. Traders feed those algorithms with market data, then the algorithms find patterns in the data and predict future prices and returns. They automatically learn and improve with more data. Traders can feed ML models with all kinds of data, including:
- Fundamental data
- Historical prices
- Historical trading volumes
- Technical indicators
- Statistical Arbitrage
Statistical arbitrage (or “stat arb”) strategies typically include two or more financial instruments. They monitor correlated instruments to detect breaks in the correlation. If the relationship breaks for a short period, there is an opportunity to buy one and sell the other at a profit. There are a variety of stat arb HFT algorithms, including Pairs Trading and Index Arbitrage.
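As a rough illustration of the pairs trading idea, the following Python sketch measures how far the latest price spread between two instruments deviates from its recent history. The price series, the entry threshold of 2 standard deviations, and the function names are assumptions for illustration; practical strategies typically also estimate a hedge ratio and test whether the relationship is stable.

```python
from statistics import mean, stdev


def spread_zscore(prices_a, prices_b):
    """Z-score of the latest spread (A minus B) relative to the earlier spreads."""
    spreads = [a - b for a, b in zip(prices_a, prices_b)]
    history, latest = spreads[:-1], spreads[-1]
    return (latest - mean(history)) / stdev(history)


def pairs_signal(zscore, entry=2.0):
    """Translate the z-score into a simple trading signal."""
    if zscore > entry:
        return "sell A, buy B"   # A looks expensive relative to B
    if zscore < -entry:
        return "buy A, sell B"   # A looks cheap relative to B
    return "no trade"


# Hypothetical prices: A and B move together until A jumps on the last bar
prices_a = [50.0, 50.5, 50.2, 50.8, 50.4, 53.0]
prices_b = [48.0, 48.4, 48.3, 48.7, 48.5, 48.6]

z = spread_zscore(prices_a, prices_b)
signal = pairs_signal(z)
print(signal)
```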
Algorithmic (Day) Trading for Beginners: The Life-Cycle of a Trading Algorithm
Developing, implementing, and maintaining a profitable HFT algorithm is a structured process. There are many steps, and most of them need human intervention and judgment.
First, the trader must determine the financial instruments that they want to trade. The more familiar the trader is with an instrument, the better. Depending on the strategy, some instruments are more suitable than others. Currencies are popular among HFT day traders as spreads and commissions are low and price volatility is high. Second, the trader needs to find the right online trading platform. The platform should offer application programming interface (API) trading. This allows traders to interact with the online broker programmatically. That means traders can stream market data and place orders with programming languages like Python.
Next, the trader pulls historical data via the Broker API. Brokers typically provide price and volume data. In some cases, traders can access fundamental data as well. If necessary, the trader needs to identify additional sources for fundamental data. Exploratory data analysis (EDA) is a key workflow in any data project. It allows analysts to gain a deeper understanding of the underlying data. Data analysis includes:
- Data visualization
- Detecting missing or corrupted data and other inconsistencies
- Cleaning, formatting, aggregating, and reshaping the data
- Saving the final data locally/online
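The detection and cleaning steps above can be sketched as follows. The bars are made-up sample data, and forward filling (reusing the last valid price) is only one of several possible cleaning choices; the function names are hypothetical.

```python
# Hypothetical one-minute bars as they might arrive from a data source
raw_bars = [
    {"time": "2022-05-02 09:30", "close": 1.0510},
    {"time": "2022-05-02 09:31", "close": None},    # missing value
    {"time": "2022-05-02 09:32", "close": -1.0},    # corrupted value
    {"time": "2022-05-02 09:33", "close": 1.0514},
]


def is_valid(price):
    """A usable price must exist and be positive."""
    return price is not None and price > 0


def clean_bars(bars):
    """Replace invalid closes with the last valid close (forward fill)."""
    cleaned, last_valid = [], None
    for bar in bars:
        price = bar["close"]
        if is_valid(price):
            last_valid = price
        elif last_valid is None:
            continue  # nothing to fill from yet, so drop the bar
        cleaned.append({"time": bar["time"], "close": last_valid})
    return cleaned


n_invalid = sum(not is_valid(bar["close"]) for bar in raw_bars)
bars = clean_bars(raw_bars)
print(f"{n_invalid} invalid bars repaired, {len(bars)} bars kept")
```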
Identifying Trading Opportunities | Formulating a Strategy | Strategy Testing
The next three steps go hand in hand. Depending on the strategy, these steps are executed simultaneously or one after another. The trader starts with a rough idea of what a profitable strategy could look like. Then, the strategy is defined with a programming language.
The following step is the most critical in the whole process. The trader must test the performance of the strategy on the data at hand. This is called backtesting. The performance includes risk and return/profit metrics. Promising strategies have the following characteristics:
- They are highly profitable.
- They have an acceptable risk profile (the lower, the better).
- They should beat the benchmark.
The benchmark is a comparable instrument or strategy. Often, a simple buy-and-hold strategy is the best benchmark.
It’s easy to find strategies that are profitable before trading costs. The challenge is to find profitable strategies after trading costs. The nature of trading costs can differ. Some brokers offer tight spreads but charge commissions. Other brokers are commission-free with wider spreads. It’s important to understand that each trade triggers costs, and traders have to include them in the strategy definition. Backtesting results are meaningless if traders ignore costs.
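The effect of trading costs can be illustrated with a toy backtest. The sketch below tests a simple moving average crossover (long when the short SMA is above the long SMA, otherwise flat) on made-up prices and compares it with buy-and-hold. The window lengths, the proportional cost of 0.1% per position change, and the price series are all assumptions chosen for illustration.

```python
def sma(prices, window, end):
    """Simple moving average of the `window` prices ending at index `end`."""
    return sum(prices[end - window + 1 : end + 1]) / window


def backtest_sma(prices, short=2, long=4, cost_per_trade=0.001):
    """Return (gross, net) growth multiples of one unit of capital."""
    gross = net = 1.0
    position = 0  # 0 = flat, 1 = long
    for t in range(long - 1, len(prices) - 1):
        # The signal uses only prices up to and including bar t (no look-ahead).
        target = 1 if sma(prices, short, t) > sma(prices, long, t) else 0
        if target != position:
            net *= 1 - cost_per_trade  # cost is charged on every position change
            position = target
        if position == 1:
            bar_return = prices[t + 1] / prices[t]
            gross *= bar_return
            net *= bar_return
    return gross, net


# Hypothetical price series
prices = [100, 101, 103, 102, 104, 107, 105, 103, 104, 108]

gross, net = backtest_sma(prices)
buy_and_hold = prices[-1] / prices[0]
print(f"gross: {gross:.4f}  net: {net:.4f}  buy-and-hold: {buy_and_hold:.4f}")
```

On this toy series the strategy is profitable before and after costs but still trails the buy-and-hold benchmark, which is exactly the kind of result a backtest is meant to reveal.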
Some strategies are optimized with the underlying data. ML strategies are a good example. ML models are fitted to the sample data. One drawback is that many fitted strategies tend to overfit. Overfitted strategies seem to be profitable on the data at hand (“in-sample”), but they fail to generate profits in the future (“out-of-sample”). In other words, overfitted strategies don’t generalize to new data. Forward testing, also known as out-of-sample backtesting, is a helpful tool to identify overfitting. The trader tests the strategy on new data that the strategy has not seen before. When overfitting could be an issue, it’s common practice to split the dataset. The trader defines and optimizes the strategy on the training set, then forward tests the strategy on the test set.
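A minimal sketch of this workflow in Python: the data is split chronologically, a parameter is optimized on the training set only, and the chosen parameter is then evaluated once on the test set. The toy momentum rule, the candidate lookbacks, and the price series are assumptions for illustration, and trading costs are left out to keep the example short.

```python
def chronological_split(prices, n_train):
    """Split a time series without shuffling, so the test set lies in the 'future'."""
    return prices[:n_train], prices[n_train:]


def momentum_return(prices, lookback):
    """Growth multiple of a toy rule: hold the asset for the next bar
    whenever the price has risen over the last `lookback` bars."""
    growth = 1.0
    for t in range(lookback, len(prices) - 1):
        if prices[t] > prices[t - lookback]:
            growth *= prices[t + 1] / prices[t]
    return growth


# Hypothetical price series
prices = [100, 102, 101, 104, 106, 105, 108, 110, 109, 107,
          108, 106, 109, 111, 110, 108, 107, 109, 112, 111]

train, test = chronological_split(prices, n_train=14)

# Optimize the lookback on the training set only ...
best_lookback = max(range(1, 5), key=lambda lb: momentum_return(train, lb))
in_sample = momentum_return(train, best_lookback)

# ... then evaluate the chosen parameter once on unseen data.
out_of_sample = momentum_return(test, best_lookback)
print(f"lookback={best_lookback}  in-sample={in_sample:.4f}  "
      f"out-of-sample={out_of_sample:.4f}")
```

A large gap between the in-sample and out-of-sample figures is a warning sign that the parameter was fitted to noise.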
In conclusion, the trader should continue only if the strategy performs well after trading costs and that performance is confirmed by forward testing.
The next step after testing is to write an implementation algorithm, which is different from the backtesting algorithm. An implementation algorithm must:
- Stream real-time (tick) data
- Convert tick data into trading signals
- Communicate with the online broker
- Place and execute orders
- Report trades and monitor performance
Much can go wrong at this point, so it’s important to test the implementation in a simulated environment. Many brokers offer practice accounts that allow paper trading. There is zero risk of losing money in a paper trading session.
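The structure of such an implementation can be sketched without any real broker connection. In the example below, `PaperBroker` and `simulated_ticks` are hypothetical stand-ins for a broker API and a real-time data stream, and the signal rule (long when the latest tick is above its recent average) is just a placeholder for an actual strategy.

```python
import random


class PaperBroker:
    """Hypothetical stand-in for a broker API: records orders instead of sending them."""

    def __init__(self):
        self.position = 0
        self.trades = []

    def place_order(self, side, units, price):
        self.position += units if side == "buy" else -units
        self.trades.append((side, units, price))  # simple trade report


def simulated_ticks(n, start=1.1000, seed=42):
    """Yield n random-walk prices in place of a real-time tick stream."""
    rng = random.Random(seed)
    price = start
    for _ in range(n):
        price += rng.uniform(-0.0005, 0.0005)
        yield round(price, 5)


def run_session(ticks, broker, window=5, units=1000):
    """Convert each tick into a signal and place orders when the signal changes."""
    recent = []
    for price in ticks:
        recent.append(price)
        if len(recent) < window:
            continue  # not enough data for a signal yet
        recent = recent[-window:]
        signal = 1 if price > sum(recent) / window else 0
        if signal == 1 and broker.position == 0:
            broker.place_order("buy", units, price)
        elif signal == 0 and broker.position > 0:
            broker.place_order("sell", units, price)
    return broker.trades


broker = PaperBroker()
trades = run_session(simulated_ticks(200), broker)
print(f"{len(trades)} simulated trades, final position: {broker.position}")
```

Keeping the broker behind a small interface like this makes it easier to run the same session logic first against a paper account and later against a live account.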
Going Live/Real Trading
When an algorithm goes live, it receives real market data and places live orders in the market. At this point, it is “in-production.” An in-production algorithm can produce profits and losses.
Live trading requires a stable and reliable technical infrastructure. Retail traders often use their local desktop computers and place and execute orders through their local internet connection. I wouldn’t recommend this. Algorithmic trading sessions can last for many hours. Technical issues can disrupt or stop trading sessions, which can lead to high losses. Thus, it’s best practice to deploy trading algorithms on a virtual (cloud) server. The benefits are:
- High availability/reliability of hardware and web connection (99.9% or higher)
- Customizable and scalable performance (CPU, RAM, connectivity)
- Trading sessions are autonomous. They can be scheduled and automated independently from local devices.
There is no guarantee that an in-production strategy will generate profits, and many strategies deteriorate over time. Thus, it’s important to closely monitor performance. If live performance confirms backtesting results, the strategy can stay in production (continue). Otherwise, the trader needs to update and/or improve the strategy. If the strategy remains unprofitable, the trader should scrap it. Additionally, the trader should regularly update the strategy with the most recent data. This applies in particular to ML algorithms.
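One simple way to compare live performance with backtesting results is to track the maximum drawdown of the live account. The sketch below is only an example of such a check; the equity values and the 5% backtest figure are made-up assumptions.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough loss, as a fraction of the peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Hypothetical account values recorded during live trading
live_equity = [10_000, 10_150, 10_050, 9_800, 9_900, 10_200]

# Hypothetical worst drawdown observed in backtesting
backtest_max_drawdown = 0.05

live_drawdown = max_drawdown(live_equity)
needs_review = live_drawdown > backtest_max_drawdown
print(f"live drawdown: {live_drawdown:.2%}, review needed: {needs_review}")
```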
Algorithmic (Day) Trading for Beginners: Tools, Infrastructure, and Required Skills
Setting up a technical infrastructure for algorithmic trading isn’t costly. Retail traders can use their existing infrastructure at home or at work. This includes customary desktop computers and internet connections. With a few exceptions, this is sufficient to define and backtest a variety of strategies. As set out above, traders should deploy in-production algorithms on a virtual server. Providers of cloud and storage services include Amazon Web Services (AWS) and Microsoft Azure. Monthly prices start as low as USD 10 for basic solutions. For some instruments, traders need to subscribe to real-time streaming market data. Depending on the broker, market data fees start at USD 5 per month. In total, ambitious traders should budget for:
- An initial investment close to USD 0 (assuming a desktop computer and an internet connection are available)
- Monthly fixed costs of up to USD 100 for cloud and market data services.
Obtaining the required skills is by far the most important investment. An algorithmic trader must be proficient in at least one programming language. Traders have the choice between Python and C++.
Remember, speed matters in algorithmic trading. C++ is faster than Python, but C++ is a lot more complex to learn. Python is much more beginner-friendly as Python code is easy to read and understand. Even more importantly, Python is the top programming language for data science and machine learning. Being skilled in it is becoming increasingly relevant for traders. Python allows beginners to build more complex and powerful algorithms in less time. This compensates for the lack of trade execution speed.
For more information on algorithmic trading with Python, check out this course.