How to Build Your First Mean Reversion Trading Strategy In Python

The beautiful thing about markets is they always move. They may not show strong momentum or trends at any given time, but that doesn’t mean you can’t make money off of a strategy. You can ride the waves of the price action up and down — even on top of a larger trend — to make money and show healthy returns.

TL;DR

We’ll build a mean reversion strategy that buys when a security reaches an extreme low and sells when it reaches an extreme high. Properly tuned, this can extract returns from almost any market regime.

Market Oscillations

Have you ever looked back at a stock and found you held on too long? You should have sold at the top, but instead thought the stock had higher to go? Or, have you waited just a bit too long to get into a trade and missed out on the bottom?

These kinds of mistakes are what oscillating or mean reversion strategies seek to address. They are designed to find those points where a market is overextended one way or another so you can make your move. There are a whole host of oscillating indicators like this. For now, we’ll take a look at a basic model, the familiar simple moving average (SMA), to serve as our indicator to buy, sell, and short a security.

To begin, we’ll show how you can do this easily in Python. Start by importing some of the basic packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf

To run this strategy, we’ll look at the SMA and see if the price is too high or too low compared to it. If the price is too low, we’ll buy with the expectation that the price will go higher, towards the moving average. If it’s overextended to the upside, then we sell or go short, again with the expectation that the price is going to drop in the near term.

def SMAMeanReversion(ticker, sma, threshold, shorts=False,
    start_date='2000-01-01', end_date='2020-12-31'):
    yfObj = yf.Ticker(ticker)
    data = yfObj.history(start=start_date, end=end_date)
    data['SMA'] = data['Close'].rolling(sma).mean()
    data['extension'] = (data['Close'] - data['SMA']) / data['SMA']
    
    data['position'] = np.nan
    data['position'] = np.where(data['extension']<-threshold,
        1, data['position'])
    if shorts:
        data['position'] = np.where(
            data['extension']>threshold, -1, data['position'])
        
    data['position'] = np.where(np.abs(data['extension'])<0.01,
        0, data['position'])
    data['position'] = data['position'].ffill().fillna(0)
    
    # Calculate returns and statistics
    data['returns'] = data['Close'] / data['Close'].shift(1)
    data['log_returns'] = np.log(data['returns'])
    data['strat_returns'] = data['position'].shift(1) * \
        data['returns']
    data['strat_log_returns'] = data['position'].shift(1) * \
        data['log_returns']
    data['cum_returns'] = np.exp(data['log_returns'].cumsum())
    data['strat_cum_returns'] = np.exp(
        data['strat_log_returns'].cumsum())
    data['peak'] = data['cum_returns'].cummax()
    data['strat_peak'] = data['strat_cum_returns'].cummax()
    
    return data.dropna()

In the function above, we supply a ticker, our SMA period, the threshold, and whether or not we want to include short positions. With that info, we use yfinance to get the data and apply our moving average. We calculate our extension to see whether a movement is over/under extended by calculating the move as a percentage of our SMA indicator.

We take a long position when we’re overextended to the low side, i.e. when the extension (the price minus the SMA, divided by the SMA) falls below the negative threshold, because we’re expecting the stock to mean revert and head higher. If it moves too far upward, we go short when shorts is enabled; otherwise we open no new position. Because we expect things to revert to the mean, we also exit our position when the price comes very close to the mean (within 1% in this case).
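
To make the entry and exit rules concrete, here is a minimal sketch that applies the same logic to a handful of made-up prices instead of downloaded data. The helper name mean_reversion_positions, its exit_band parameter, and the prices are assumptions for illustration only; the rules mirror the function above.

```python
import numpy as np
import pandas as pd

def mean_reversion_positions(close, sma, threshold, exit_band=0.01, shorts=False):
    """Illustrative helper: SMA extension and positions for a price series."""
    sma_values = close.rolling(sma).mean()
    extension = (close - sma_values) / sma_values
    position = pd.Series(np.nan, index=close.index)
    position[extension < -threshold] = 1           # overextended to the downside
    if shorts:
        position[extension > threshold] = -1       # overextended to the upside
    position[extension.abs() < exit_band] = 0      # back near the mean: exit
    # Days with no signal keep the previous position
    return extension, position.ffill().fillna(0)

# Made-up prices: a sharp drop followed by a flat recovery
prices = pd.Series([100, 100, 100, 85, 90, 90, 90], dtype=float)
extension, position = mean_reversion_positions(prices, sma=3, threshold=0.1)
print(position.tolist())  # [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0]
```

The drop to 85 puts the price about 10.5% below its 3-day average, so we go long. We hold through the days with no signal and exit once the price sits back on the average.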

We can test this with some different tickers to see how it performs. We’ll use the getStratStats function we defined in a previous article to evaluate the results.
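
The getStratStats function isn't reproduced in this article. If you don't have it handy, the sketch below is a hypothetical stand-in, not the original: it assumes the column names produced by SMAMeanReversion, a 2% risk-free rate, and 252 trading days per year, and it returns one column of statistics for the strategy and one for buy and hold. Its numbers may differ from those of the original function.

```python
import numpy as np
import pandas as pd

def get_strat_stats_sketch(data, risk_free_rate=0.02, periods=252):
    """Hypothetical stand-in for getStratStats (not the original function).

    Expects the columns produced by SMAMeanReversion and returns one dict
    of statistics for the strategy and one for buy and hold.
    """
    stats = {}
    for label, prefix in [('strat_stats', 'strat_'), ('base_stats', '')]:
        log_returns = data[prefix + 'log_returns']
        cum_returns = data[prefix + 'cum_returns']
        peak = data[prefix + 'peak']
        years = len(log_returns) / periods
        annual_return = np.exp(log_returns.sum() / years) - 1
        annual_vol = log_returns.std() * np.sqrt(periods)
        stats[label] = {
            'total_return': np.exp(log_returns.sum()) - 1,
            'annual_return': annual_return,
            'annual_volatility': annual_vol,
            'sharpe_ratio': (annual_return - risk_free_rate) / annual_vol,
            'max_drawdown': ((peak - cum_returns) / peak).max(),
        }
    return stats

# Synthetic data with the same column names, just to exercise the function
rng = np.random.default_rng(0)
demo = pd.DataFrame({'log_returns': rng.normal(0, 0.01, 500)})
demo['strat_log_returns'] = 0.5 * demo['log_returns']
for prefix in ['', 'strat_']:
    demo[prefix + 'cum_returns'] = np.exp(demo[prefix + 'log_returns'].cumsum())
    demo[prefix + 'peak'] = demo[prefix + 'cum_returns'].cummax()

demo_stats = pd.DataFrame(get_strat_stats_sketch(demo)).round(3)
print(demo_stats)
```

The two-column layout is a guess based on how the results are relabeled later in this article, where each stats table gets exactly two column names.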

ticker = 'AAL'
SMA = 50
threshold = 0.1
shorts = False
data = SMAMeanReversion(ticker, SMA, threshold, shorts)
stats_dict = getStratStats(data)
df_stats = pd.DataFrame(stats_dict).round(3)
df_stats

Running this on American Airlines with a 50-day SMA and a 10% extension threshold, we outperform the buy and hold by a significant margin.

We can also view the extension and the thresholds, along with our positions over time.

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
fig, ax = plt.subplots(3, figsize=(10, 8), sharex=True)
long = data.loc[data['position']==1]['Close']
ax[0].plot(data['Close'], label='Price', linestyle=':', color=colors[1])
ax[0].plot(data['SMA'], label='SMA', linestyle='--', color=colors[0])
ax[0].scatter(long.index, long, label='Long', c=colors[2])
ax[0].legend(bbox_to_anchor=[1, 0.75])
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'{ticker} Price and Positions with {SMA}-Day Moving Average')
ax[1].plot(data['extension']*100, label='Extension', color=colors[0])
ax[1].axhline(threshold*100, linestyle='--', color=colors[1])
ax[1].axhline(-threshold*100, label='Threshold', linestyle='--', color=colors[1])
ax[1].axhline(0, label='Neutral', linestyle=':', color='k')
ax[1].set_title('Price Extension and Buy/Sell Thresholds')
ax[1].set_ylabel(f'Extension (%)')
ax[1].legend(bbox_to_anchor=[1, 0.75])
ax[2].plot(data['position'])
ax[2].set_xlabel('Date')
ax[2].set_title('Position')
ax[2].set_yticks([-1, 0, 1])
ax[2].set_yticklabels(['Short', 'Neutral', 'Long'])
plt.tight_layout()
plt.show()

Our long-term cumulative returns show that the strategy was much better than the buy-and-hold baseline for this stock over the long run, even if it had periods of underperformance.

Adding some Protection

Unfortunately, this strategy is susceptible to losses during large downward moves. This can be seen clearly in the plot above, where the model takes a large hit during the 2008 crisis and the more recent 2020 COVID crash. The model winds up in exactly the wrong position at these times because it decides to go long in the face of increased selling.

To address this, we can include a safety threshold, a point at which the model has become too extended so that momentum dominates over mean reversion and we should exit the position.

def SMAMeanReversionSafety(ticker, sma, threshold, 
    safety_threshold=0.25, shorts=False, 
    start_date='2000-01-01', end_date='2020-12-31'):
    yfObj = yf.Ticker(ticker)
    data = yfObj.history(start=start_date, end=end_date)
    data['SMA'] = data['Close'].rolling(sma).mean()
    data['extension'] = (data['Close'] - data['SMA']) / data['SMA']
    
    data['position'] = np.nan
    data['position'] = np.where(
        (data['extension']<-threshold) & 
        (data['extension']>-safety_threshold), 
        1, data['position'])
    
    if shorts:
        data['position'] = np.where(
            (data['extension']>threshold) & 
            (data['extension']<safety_threshold),
            -1, data['position'])
        
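    # Exit when the price is back within 1% of the SMA
    data['position'] = np.where(np.abs(data['extension'])<0.01,
        0, data['position'])
    # Exit when the extension is beyond the safety threshold; without
    # this, ffill() would carry an open position through the move
    data['position'] = np.where(
        np.abs(data['extension'])>=safety_threshold,
        0, data['position'])
    data['position'] = data['position'].ffill().fillna(0)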
    
    # Calculate returns and statistics
    data['returns'] = data['Close'] / data['Close'].shift(1)
    data['log_returns'] = np.log(data['returns'])
    data['strat_returns'] = data['position'].shift(1) * \
        data['returns']
    data['strat_log_returns'] = data['position'].shift(1) * \
        data['log_returns']
    data['cum_returns'] = np.exp(data['log_returns'].cumsum())
    data['strat_cum_returns'] = np.exp(
        data['strat_log_returns'].cumsum())
    data['peak'] = data['cum_returns'].cummax()
    data['strat_peak'] = data['strat_cum_returns'].cummax()
    
    return data.dropna()

We can include this by updating the np.where arguments to include both the threshold and the safety_threshold variables. So if the extension value is between those bounds, we buy or go short, otherwise we're neutral. Let's see how these perform together.
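
Here is a small sketch of the band logic on a hand-written extension series, so you can see each rule fire. The helper name positions_with_safety and the numbers are made up for illustration. Note that because positions are forward-filled, a day outside the band keeps the previous position unless it is explicitly set to neutral, so this sketch assigns 0 once the extension passes the safety threshold.

```python
import numpy as np
import pandas as pd

def positions_with_safety(extension, threshold, safety_threshold,
                          exit_band=0.01, shorts=False):
    """Illustrative helper: positions from an extension series with a safety band."""
    position = pd.Series(np.nan, index=extension.index)
    # Enter only while the extension sits between the two thresholds
    position[(extension < -threshold) & (extension > -safety_threshold)] = 1
    if shorts:
        position[(extension > threshold) & (extension < safety_threshold)] = -1
    position[extension.abs() < exit_band] = 0            # reverted to the mean
    position[extension.abs() >= safety_threshold] = 0    # too extended: step aside
    return position.ffill().fillna(0)

# Made-up extensions: a dip that deepens past the safety level, then recovers
ext = pd.Series([0.0, -0.05, -0.12, -0.14, -0.20, -0.30, -0.12, -0.005])
pos = positions_with_safety(ext, threshold=0.1, safety_threshold=0.15)
print(pos.tolist())  # [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]
```

We go long at -12%, step aside when the move deepens to -20% and -30%, re-enter when the extension comes back inside the band, and exit near the mean.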

ticker = 'AAL'
SMA = 50
threshold = 0.1
safety_threshold = 0.15
shorts = False
data = SMAMeanReversion(ticker, SMA, threshold, shorts)
data_safe = SMAMeanReversionSafety(ticker, SMA, threshold, safety_threshold, shorts)
safe_stats_dict = getStratStats(data_safe)
df_safe_stats = pd.DataFrame(safe_stats_dict).round(3)
fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(data_safe['strat_cum_returns'], label='Mean Reversion Strategy with Safety')
ax.plot(data['strat_cum_returns'], label='Mean Reversion Strategy')
ax.plot(data_safe['cum_returns'], label=f'{ticker}')
ax.set_xlabel('Date')
ax.set_ylabel('Cumulative Returns (multiple of initial capital)')
ax.set_title('Cumulative Returns for Mean Reversion and Buy and Hold Strategies')
ax.legend()
plt.show()

In this case, the safety threshold model outperforms the mean reversion strategy without the safety parameter nearly every step of the way. It cuts its losses during big drawdowns like the COVID crash, which preserves capital and lets it leave the other strategies in the dust on a stock that has moved sideways overall.

Adding this safety feature not only boosted total returns, but reduced the annual volatility of the strategy, increased the Sharpe ratio, and made drawdowns less severe (exactly as we had hoped).

df_safe_stats.columns = ['Mean Reversion with Safety', 'Buy and Hold']
df_stats.columns = ['Mean Reversion', 'x']
df_stats = pd.concat([df_stats.T, df_safe_stats.T])
df_stats.drop('x', axis=0, inplace=True)
df_stats

Looking at 2020, you can see how this safety feature winds up avoiding the worst of the crash.

import calendar
ticks = [pd.to_datetime(f'2020-{i}-01') for i in np.arange(1, 13)]
cr_mr_safe = np.exp(data_safe.loc[data_safe.index>=ticks[0]]['strat_log_returns'].cumsum())
cr_mr = np.exp(data.loc[data.index>=ticks[0]]['strat_log_returns'].cumsum())
cr_base = np.exp(data.loc[data.index>=ticks[0]]['log_returns'].cumsum())
fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(cr_mr_safe, label='Mean Reversion Strategy with Safety')
ax.plot(cr_mr, label='Mean Reversion Strategy')
ax.plot(cr_base, label=f'{ticker}')
ax.set_xlabel('Date')
ax.set_ylabel('Cumulative Returns (multiple of initial capital)')
ax.set_title('Cumulative Returns for Mean Reversion and Buy and Hold Strategies')
ax.set_xlim([pd.to_datetime('2020-01-01'), data.index[-1]])
ax.set_xticks(ticks)
ax.set_xticklabels([i for i in calendar.month_abbr if i != ''])
ax.legend()
plt.show()

Both mean reverting models go into the year neutral. However, the standard mean reversion model takes a long position when the crash begins and stays long until the bottom, then gets out for the bump that came in June. It catches up thanks to some later short positions to end the year above the long-only strategy, but down for the year. The mean reversion with safety model doesn’t do much differently apart from missing that big drawdown. This capital preservation move allows it to have more to work with and wait for the bottom to hit before opening a position. This, of course, compounds throughout the year, making it a much more effective strategy for this stock.

Bottom Line

Mean reversion strategies try to take into account the natural tendency of stocks to go back to a long run average. This is a great way to extract gains from choppy and sideways-moving markets.

The example above is a good start and gives you a feel for some of the important parameters of this model, but it is limited. It doesn’t take into account dividends or transaction fees, and it doesn’t run a proper walk-forward optimization to choose parameters.
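
As a rough illustration of the transaction fee point, the sketch below charges a proportional cost every time the position changes. The helper name net_strat_log_returns and the 0.1% cost are assumptions, not figures from this article, and real fees and slippage will vary.

```python
import numpy as np
import pandas as pd

def net_strat_log_returns(position, log_returns, cost_per_trade=0.001):
    """Strategy log returns after a proportional cost on every position change.

    cost_per_trade is a hypothetical fee (0.1% here), not a figure from the article.
    """
    held = position.shift(1).fillna(0)       # position carried into each bar
    traded = held.diff().abs().fillna(0)     # units bought or sold to get there
    gross = held * log_returns
    return gross + traded * np.log(1 - cost_per_trade)

# Made-up positions and returns: one round trip (enter, then exit)
position = pd.Series([0, 1, 1, 0, 0], dtype=float)
log_returns = pd.Series([0.0, 0.01, 0.02, -0.01, 0.0])
net = net_strat_log_returns(position, log_returns)
print(net.sum())
```

Passing data['position'] and data['log_returns'] from the functions above would give a cost-adjusted version of strat_log_returns. The more often a strategy trades, the more this drag matters.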

To get all of this with a hardened event-based backtest system, come check us out at Raposa. We’re building trading platforms for individuals to give them best-in-class experience and data.
