## How to Trade the MACD: Four Strategies with Backtests

The Moving Average Convergence-Divergence (MACD) is a popular and versatile indicator that appears in a number of trading systems. In its most basic form, the MACD is the difference between two exponential moving averages (EMA), one fast and one slow. From this simple beginning, a host of other indicators such as signal lines and MACD bars are built. We’ll show you how to implement each of these, but refer back to this article for a thorough explanation.
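To make the definition concrete, here is a minimal sketch using pandas' built-in `ewm` (an alternative to the hand-rolled EMA functions later in this article; note that `ewm(span=N, adjust=False)` seeds from the first observation rather than an SMA, so early values differ slightly). The toy prices and short spans are purely illustrative:

```python
import pandas as pd

# Toy price series; any Close column works the same way
prices = pd.Series([100., 101., 102., 101., 103., 104., 102., 105.,
                    106., 104., 107., 108., 109., 107., 110., 111.])

# MACD = fast EMA - slow EMA (short spans so the toy series warms up)
ema_fast = prices.ewm(span=3, adjust=False).mean()
ema_slow = prices.ewm(span=6, adjust=False).mean()
macd = ema_fast - ema_slow
```

With the standard 12/26 parameters, you would simply use `span=12` and `span=26` on a real Close series.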

In this post, we’re going to implement and backtest four versions of the MACD in Python to illustrate how to trade it, and discuss areas for improvement.

Of course, if you want to skip all the code, you can try our new platform for free here to test these ideas.

## Mean-Reverting MACD

A good place to get started with the MACD is to see when the signal diverges, i.e. when the MACD moves far away from 0. A mean-reverting strategy can quickly be built on top of this.

Our strategy is going to use standard parameters for the MACD. Our fast EMA will look over the past 12 days, and the slow EMA will run over 26 days. The model is going to buy our stock when the price breaks below a certain threshold, and will sell when the MACD converges back to 0. If the MACD runs high, we’ll short the stock and sell when it gets back to 0. We’re simply trying to jump on large, outlier movements in the price with the hope that the price will move back towards the longer EMA.

To get going, fire up Python and import the following packages.

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt


To start, we need to calculate the MACD. As stated above, the MACD is built on top of the EMA, so we’re going to write two functions to calculate the EMA: one that applies the update formula, and another that initializes it and applies it to our data frame. From there, we can write our MACD function that will take these parameters, calculate the EMAs and the MACD, and fill in our data frame.

The code for the following functions was taken from this previous post on the MACD, which goes through all of the details of the code and calculations.

def _calcEMA(P, last_ema, N):
    return (P - last_ema) * (2 / (N + 1)) + last_ema

def calcEMA(data: pd.DataFrame, N: int, key: str = 'Close'):
    # Initialize series
    data['SMA_' + str(N)] = data[key].rolling(N).mean()
    ema = np.zeros(len(data)) + np.nan
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if np.isnan(ema[i-1]):
            # Seed the EMA with the SMA once it's defined
            ema[i] = row['SMA_' + str(N)]
        else:
            ema[i] = _calcEMA(row[key], ema[i-1], N)

    data['EMA_' + str(N)] = ema.copy()
    return data

def calcMACD(data: pd.DataFrame, N_fast: int, N_slow: int):
    assert N_fast < N_slow, \
        "Fast EMA must be less than slow EMA parameter."
    data = calcEMA(data, N_fast)
    data = calcEMA(data, N_slow)
    # Subtract values to get MACD
    data['MACD'] = data[f'EMA_{N_fast}'] - data[f'EMA_{N_slow}']
    # Drop extra columns
    data.drop(['Open', 'High', 'Low', 'Volume',
               'Dividends', 'Stock Splits'], axis=1, inplace=True)
    return data


Now with the MACD in place, we’re going to write two more functions. One is going to be our strategy function, and the other will calculate all of our returns. We’re breaking this second one into its own function because we can re-use the code later when we write other MACD strategies.

The MACDReversionStrategy is where our strategy is implemented (if you didn’t guess that by the name). This is a simple, vectorized implementation that is just going to look for our level value, and trade if the MACD moves outside of its bounds.

def calcReturns(df):
    df['returns'] = df['Close'] / df['Close'].shift(1)
    df['log_returns'] = np.log(df['returns'])
    df['strat_returns'] = df['position'].shift(1) * df['returns']
    df['strat_log_returns'] = df['position'].shift(1) * \
        df['log_returns']
    df['cum_returns'] = np.exp(df['log_returns'].cumsum()) - 1
    df['strat_cum_returns'] = np.exp(
        df['strat_log_returns'].cumsum()) - 1

    return df

def MACDReversionStrategy(data, N_fast=12, N_slow=26,
                          level=1, shorts=True):
    # calcMACD already drops the extra price columns
    df = calcMACD(data, N_fast, N_slow)
    df['position'] = np.nan
    df['position'] = np.where(df['MACD']<-level, 1, df['position'])
    if shorts:
        df['position'] = np.where(df['MACD']>level, -1, df['position'])

    # Exit when the MACD crosses back through 0 (sign change)
    df['position'] = np.where(df['MACD'].shift(1)/df['MACD']<0,
                              0, df['position'])
    df['position'] = df['position'].ffill().fillna(0)

    return calcReturns(df)


The next step is to get data. We’ll rely on the yfinance package to get some free data from Yahoo! Finance. For backtests, more data is better, so we’re going to grab an older company to see how this performs over a long, multi-decade time horizon. Let’s choose 3M (ticker: MMM) and see how this strategy works.

ticker = 'MMM'
start = '2000-01-01'
end = '2020-12-31'
yfObj = yf.Ticker(ticker)
df = yfObj.history(start=start, end=end)
N_fast = 12
N_slow = 26
df_reversion = MACDReversionStrategy(df.copy(),
    N_fast=N_fast, N_slow=N_slow, level=1)

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
ax[0].plot(df_reversion['strat_cum_returns']*100,
    label='MACD Reversion')
ax[0].plot(df_reversion['cum_returns']*100, label=f'{ticker}')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(
    'Cumulative Returns for MACD Reversion Strategy ' +
    f'and Buy and Hold for {ticker}')
ax[0].legend()
ax[1].plot(df_reversion['MACD'])
ax[1].axhline(level, label='Short Level', color=colors[1],
    linestyle='--')
ax[1].axhline(-level, label='Long Level', color=colors[1],
    linestyle='--')
ax[1].axhline(0, label='Neutral', color='k', linestyle='--')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('MACD')
ax[1].set_title(f'{N_fast}/{N_slow} MACD for {ticker}')
plt.tight_layout()
plt.show()


Our MACD mean reversion strategy outpaces the underlying stock from 2000–2016, before the strategy begins giving back some value and then underperforms for a few years. Starting in 2018 and continuing through the end of 2020, however, the strategy simply takes off!

One thing to notice about this model is that the range of values for the MACD increases over time. It’s likely that the signal values we chose (-1 and 1 for simplicity) aren’t optimal, and an adaptive level may prove better. The drawback is that this adds another parameter we need to fit, which can lead to overfitting our results.
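One way to sketch such an adaptive level (an illustration only; the synthetic MACD series and the 60-day window below are assumptions, not part of the strategy above) is to normalize the MACD by its rolling standard deviation and trade a fixed band on the z-scored series:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a MACD series whose range grows over time
rng = np.random.default_rng(42)
macd = pd.Series(np.cumsum(rng.normal(0, 0.1, 500))
                 * np.linspace(0.5, 2, 500))

# Normalize by a rolling standard deviation so the entry band
# adapts as the MACD's range expands or contracts
window = 60
macd_z = macd / macd.rolling(window).std()

# Same reversion logic as before, applied to the normalized series
level = 1.0
position = pd.Series(np.nan, index=macd.index)
position[macd_z < -level] = 1              # long when stretched low
position[macd_z > level] = -1              # short when stretched high
position[(macd.shift(1) / macd) < 0] = 0   # exit when MACD crosses 0
position = position.ffill().fillna(0)
```

The trade-off stated above still applies: the window length is one more parameter to fit.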

## MACD with Signal Line

This next method relies on the MACD signal line. This is just the EMA of the MACD itself. Because of this, we can re-use a lot of our code from the previous functions to add the signal line to our data.

def calcMACDSignal(data: pd.DataFrame, N_fast: int, N_slow: int,
                   N_sl: int = 9):
    data = calcMACD(data, N_fast=N_fast, N_slow=N_slow)
    data = calcEMA(data, N_sl, key='MACD')
    # Rename columns
    data.rename(
        columns={f'SMA_{N_sl}': f'SMA_MACD_{N_sl}',
                 f'EMA_{N_sl}': f'SignalLine_{N_sl}'}, inplace=True)
    return data


A common way to trade the MACD with the signal line is to buy when the MACD is above the signal line, and short or sell when the lines cross again.

def MACDSignalStrategy(data, N_fast=12, N_slow=26, N_sl=9,
                       shorts=False):
    df = calcMACDSignal(data, N_fast, N_slow, N_sl)
    df['position'] = np.nan
    df['position'] = np.where(df['MACD']>df[f'SignalLine_{N_sl}'], 1,
                              df['position'])

    if shorts:
        df['position'] = np.where(df['MACD']<df[f'SignalLine_{N_sl}'],
                                  -1, df['position'])
    else:
        df['position'] = np.where(df['MACD']<df[f'SignalLine_{N_sl}'],
                                  0, df['position'])

    df['position'] = df['position'].ffill().fillna(0)
    return calcReturns(df)


For the signal line, it’s fairly typical to look at the 9-day EMA, so that’s what we’ll use here. Again, no fiddling with the settings or optimization, we’re just taking a first pass at the backtest.

N_sl = 9
df_signal = MACDSignalStrategy(df.copy(), N_fast=N_fast,
    N_slow=N_slow, N_sl=N_sl)

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
ax[0].plot(df_signal['strat_cum_returns']*100,
    label='MACD Signal Strategy')
ax[0].plot(df_signal['cum_returns']*100, label=f'{ticker}')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(
    'Cumulative Returns for MACD Signal Strategy ' +
    f'and Buy and Hold for {ticker}')
ax[0].legend()

ax[1].plot(df_signal['MACD'], label='MACD')
ax[1].plot(df_signal[f'SignalLine_{N_sl}'],
    label='Signal Line', linestyle=':')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('MACD')
ax[1].set_title(f'{N_fast}/{N_slow}/{N_sl} MACD for {ticker}')
ax[1].legend()

plt.tight_layout()
plt.show()


Here, our results weren’t nearly as good as the mean reversion model’s (the interplay between the signal line and the MACD is hard to discern at this time scale; zooming in reveals many crosses). This strategy does, however, avoid large or long drawdowns. It’s slow and steady, but it fails to win the race against a long-only, buy-and-hold approach.

In a more complex setting where you are managing a portfolio of strategies, an equity curve like the one for the MACD signal strategy could be very valuable. The steady returns could provide a safe haven to allocate capital to if your high-return strategies suddenly underperform due to a shift in the market regime.

If you’re looking for all out returns, changing up some of the parameters could be a good way to go beyond the standard values we used here.
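As a sketch of what that parameter search might look like (synthetic prices, pandas' `ewm` standing in for the hand-rolled EMA, and an illustrative grid of spans; not an endorsement of any particular values):

```python
import numpy as np
import pandas as pd

# Synthetic price path stands in for real data in this sketch
rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 1000))))

def signal_line_log_return(close, n_fast, n_slow, n_sl=9):
    # Long when the MACD is above its signal line, flat otherwise
    macd = (close.ewm(span=n_fast, adjust=False).mean()
            - close.ewm(span=n_slow, adjust=False).mean())
    signal = macd.ewm(span=n_sl, adjust=False).mean()
    position = (macd > signal).astype(int)
    log_returns = np.log(close / close.shift(1))
    return (position.shift(1) * log_returns).sum()

# Small grid of candidate fast/slow spans, keeping fast < slow
results = {(f, s): signal_line_log_return(close, f, s)
           for f in (8, 12, 16) for s in (21, 26, 31) if f < s}
best = max(results, key=results.get)
```

Keep in mind that the more combinations you try, the greater the risk of overfitting to one stock's history.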

## MACD Momentum

Another technique that is frequently used is to take the difference between the MACD and the signal line. This is then plotted as vertical bars and called a MACD bar or MACD histogram plot. If the bars grow over time, we have increasing momentum and go long; if they shrink, we have slowing momentum and a possible reversal.

Calculating this is rather straightforward: we just take our MACD and subtract off the signal line. Again, we can re-use a lot of our previous code to keep things short and sweet.

def calcMACDBars(data: pd.DataFrame, N_fast: int, N_slow: int,
                 N_sl: int = 9):
    data = calcMACDSignal(data, N_fast=N_fast, N_slow=N_slow,
                          N_sl=N_sl)
    data['MACD_bars'] = data['MACD'] - data[f'SignalLine_{N_sl}']
    return data


The histogram is positive when the MACD is above the signal line, and negative when it falls below.

This is an indicator of an indicator, and thus a few steps removed from the actual price. Regardless, there are a variety of ways to trade using it. One is to look for increasing or decreasing momentum: if consecutive positive bars grow, we have a bullish signal for increasing momentum as the MACD moves away from the signal line; if the bars shrink, the lines are converging and we have a bearish signal we can short.

We can also look for crossovers; however, this is the same as the signal line strategy we implemented above.

You’ll also see things such as peak-trough divergences or slant divergences to generate signals. These look for consecutive hills in the MACD histogram chart and are usually picked up via visual inspection.

We’re going to start with a momentum strategy that looks at the growth of consecutive bars. We’ll use the standard, 12/26/9 format for the MACD signal line parameters, but we’ll see if we can pick up on three consecutive days of growth (positive or negative) in the bars for buy/short signals.
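The rolling sign-sum at the heart of this idea can be illustrated on a toy series (the bar values are made up): three consecutive increases push the 3-period rolling sum of signs to +3, which is the entry condition.

```python
import numpy as np
import pandas as pd

# Toy MACD-bar series: three straight increases, then a pullback
bars = pd.Series([0.1, 0.2, 0.35, 0.5, 0.4])

growth = np.sign(bars.diff(1))       # +1 on growth, -1 on shrinkage
momentum = growth.rolling(3).sum()   # +3 means three straight increases
```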

def MACDMomentumStrategy(data, N_fast=12, N_slow=26, N_sl=9,
                         N_mom=3, shorts=False):
    df = calcMACDBars(data, N_fast=N_fast, N_slow=N_slow, N_sl=N_sl)
    df['growth'] = np.sign(df['MACD_bars'].diff(1))
    df['momentum'] = df['growth'].rolling(N_mom).sum()
    if shorts:
        df['position'] = df['momentum'].map(
            lambda x: np.sign(x) * 1 if np.abs(x) == N_mom else 0)
    else:
        df['position'] = df['momentum'].map(
            lambda x: 1 if x == N_mom else 0)
    df['position'] = df['position'].ffill().fillna(0)
    return calcReturns(df)


Running and plotting:

N_mom = 3

df_mom = MACDMomentumStrategy(df.copy(), N_fast=N_fast,
    N_slow=N_slow, N_sl=N_sl, N_mom=N_mom)

fig, ax = plt.subplots(3, figsize=(12, 8), sharex=True)
ax[0].plot(df_mom['strat_cum_returns']*100, label='MACD Momentum')
ax[0].plot(df_mom['cum_returns']*100, label=f'{ticker}')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(f'Cumulative Returns for MACD Momentum Strategy and Buy and Hold for {ticker}')
ax[0].legend()
ax[1].plot(df_mom['MACD'], label='MACD')
ax[1].plot(df_mom[f'SignalLine_{N_sl}'], label='Signal Line',
    linestyle=':')
ax[1].set_title(f'{N_fast}/{N_slow}/{N_sl} MACD for {ticker}')
ax[1].legend()
ax[2].bar(df_mom.index, df_mom['MACD_bars'], label='MACD Bars',
    color=colors[4])
ax[2].set_xlabel('Date')
ax[2].set_title(f'MACD Bars for {ticker}')
plt.tight_layout()
plt.show()


This one doesn’t look so great. 20 years and losing 15% is not a strategy I want to be invested in. If we look a little deeper into this one, we can see that we jump in and out of a lot of positions very quickly. Usually we have some momentum, but because the momentum signal we’re looking at can change after a day or two, we aren’t in a position long enough to ride a trend. Essentially, we’re acting like a trend follower, but we bail so quickly we wind up taking a lot of small losses and have very few gains to show for it.

To alleviate this, let’s see if we can improve by using our MACD bars to signal momentum and only exit a position when the MACD bar crosses over. To see this, we’ll look for a sign change from positive to negative or vice versa.
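The sign-change test is the same shift-and-divide idiom we used for the zero cross in the mean reversion exit. Isolated on a toy series (made-up values; note the ratio misbehaves if a bar is exactly zero):

```python
import pandas as pd

# Toy MACD-bar series with one positive-to-negative flip and one back
bars = pd.Series([0.5, 0.8, 0.3, -0.2, -0.6, -0.1, 0.4])

# The ratio of consecutive values is negative exactly when the sign flips
crossed = (bars.shift(1) / bars) < 0
```

`crossed` is `True` exactly where consecutive bars have opposite signs.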

def MACDSignalMomentumStrategy(data, N_fast=12, N_slow=26, N_sl=9,
                               N_mom=3, shorts=False):
    df = calcMACDBars(data, N_fast=N_fast, N_slow=N_slow, N_sl=N_sl)
    df['growth'] = np.sign(df['MACD_bars'].diff(1))
    df['momentum'] = df['growth'].rolling(N_mom).sum()
    # Enter a long/short position if momentum is going in the right
    # direction and wait for a cross-over to exit
    position = np.zeros(len(df))
    for i, _row in enumerate(df.iterrows()):
        row = _row[1]
        mom = row['momentum']
        if np.isnan(mom):
            last_row = row.copy()
            continue
        if np.abs(mom) == N_mom and position[i-1] == 0:
            # Enter new position
            if shorts:
                position[i] = np.sign(mom)
            else:
                position[i] = 1 if np.sign(mom) == 1 else 0
        elif row['MACD_bars'] / last_row['MACD_bars'] < 0:
            # Exit when the MACD bars change sign
            position[i] = 0
        else:
            # Hold position
            position[i] = position[i-1]
        last_row = row.copy()

    df['position'] = position

    return calcReturns(df)


Running this combined strategy:

df_sig_mom = MACDSignalMomentumStrategy(df.copy(), N_fast=N_fast,
    N_slow=N_slow, N_sl=N_sl, N_mom=N_mom)

fig, ax = plt.subplots(3, figsize=(12, 8), sharex=True)
ax[0].plot(df_sig_mom['strat_cum_returns']*100,
    label='MACD Signal-Momentum')
ax[0].plot(df_sig_mom['cum_returns']*100, label=f'{ticker}')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Cumulative Returns for MACD Signal-Momentum ' +
    f'Strategy and Buy and Hold for {ticker}')
ax[0].legend()
ax[1].plot(df_sig_mom['MACD'], label='MACD')
ax[1].plot(df_sig_mom[f'SignalLine_{N_sl}'], label='Signal Line',
    linestyle=':')
ax[1].set_title(f'{N_fast}/{N_slow}/{N_sl} MACD for {ticker}')
ax[1].legend()
ax[2].bar(df_sig_mom.index, df_sig_mom['MACD_bars'],
    label='MACD Bars', color=colors[4])
ax[2].set_xlabel('Date')
ax[2].set_title(f'MACD Bars for {ticker}')
plt.tight_layout()
plt.show()


This combined model does perform better, yielding us about 250% over this time frame. That is still about half of the long-only strategy that we’re trying to beat, but we could continue to tinker with these ideas to come up with something even better.

Before moving on, let’s take a look at some of the key metrics for each of these strategies.

## Comparing Strategies

Below, we have a helper function to get our strategy statistics.

def getStratStats(log_returns: pd.Series,
                  risk_free_rate: float = 0.02):
    stats = {}
    # Total Returns
    stats['tot_returns'] = np.exp(log_returns.sum()) - 1
    # Mean Annual Returns
    stats['annual_returns'] = np.exp(log_returns.mean() * 252) - 1
    # Annual Volatility
    stats['annual_volatility'] = log_returns.std() * np.sqrt(252)
    # Sharpe Ratio
    stats['sharpe_ratio'] = (
        (stats['annual_returns'] - risk_free_rate)
        / stats['annual_volatility'])
    # Max Drawdown
    cum_returns = log_returns.cumsum()
    peak = cum_returns.cummax()
    drawdown = peak - cum_returns
    stats['max_drawdown'] = drawdown.max()
    # Max Drawdown Duration
    strat_dd = drawdown[drawdown==0]
    strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
    strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
    strat_dd_days = np.hstack([strat_dd_days,
        (drawdown.index[-1] - strat_dd.index[-1]).days])
    stats['max_drawdown_duration'] = strat_dd_days.max()
    return stats


Getting stats for each strategy:

stats_rev = getStratStats(df_reversion['strat_log_returns'])
stats_sig = getStratStats(df_signal['strat_log_returns'])
stats_mom = getStratStats(df_mom['strat_log_returns'])
stats_sig_mom = getStratStats(df_sig_mom['strat_log_returns'])
stats_base = getStratStats(df_reversion['log_returns'])
stats_dict = {'Mean Reversion': stats_rev,
              'Signal Line': stats_sig,
              'Momentum': stats_mom,
              'Signal-Momentum': stats_sig_mom,
              'Baseline': stats_base}
pd.DataFrame(stats_dict)


This baseline was pretty tough to beat: over 9% annual returns despite some large drawdowns. The MACD mean reversion strategy beat it easily, though, and with a better Sharpe ratio to boot. The signal line and momentum models both severely underperformed the baseline, with the MACD momentum model being flat-out atrocious. Combining the two yielded better results: still below the baseline, but with a reasonable Sharpe ratio and a smaller drawdown (albeit a longer one than the baseline's).

## No Guarantees

Each of these models could be tweaked and improved, but they’re only a small taste of the possibilities available to trade using the MACD and its derivatives. No matter how good that mean reversion strategy looks, nothing here should be blindly implemented.

While we ran a fine backtest, there are issues: dividends weren’t taken into account, transaction costs and slippage were ignored, only one stock was examined, and so forth. If you really want to trade algorithmically, you’re better off with a strategy that manages a diversified portfolio, on data that has been adjusted for dividends, with transaction costs, over a long time horizon.
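As a rough illustration of how a cost adjustment could be bolted onto a vectorized backtest like ours (the toy returns, positions, and the 10-basis-point cost per unit of position change are all assumptions):

```python
import numpy as np
import pandas as pd

# Toy inputs: daily log returns and a position series from some strategy
log_returns = pd.Series([0.01, -0.005, 0.002, 0.008, -0.003, 0.004])
position = pd.Series([0, 1, 1, 1, -1, 0])

cost_per_trade = 0.001  # assumed 10 bps per unit of position change

# Charge a cost on every change in position; a flip from long to
# short moves two units, so it costs double
trades = position.diff().abs().fillna(position.abs())
strat_log_returns = (position.shift(1) * log_returns
                     - trades * cost_per_trade)
net_return = np.exp(strat_log_returns.sum()) - 1
```

Even this crude adjustment can flip a marginally profitable, high-turnover strategy into a losing one.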

At Raposa, we’re building an engine that will enable you to quickly test ideas like these on a portfolio of securities, with proper money management, a plethora of signals, and high quality data, all without a single line of code. We have a state-of-the-art system that will allow you to customize your model to suit your needs, and deploy it when you’re ready so you can trade live. If you want to move beyond simple backtests and strategies, then sign up below to be kept up to date on our latest developments.

## 4 Ways to Trade the Trend Intensity Indicator

Determining the strength of a trend can provide a valuable edge to your trading strategy and help you determine when to go long and let it ride, or not. This is what the Trend Intensity Indicator (TII) was designed to do.

This indicator is as simple to interpret as more familiar values like the RSI. It’s scaled from 0-100 where higher numbers indicate a stronger upward trend, lower values a stronger downward trend, and values around the centerline (50) are neutral.

Calculating the TII takes place in 3 quick steps:

1. Calculate the Simple Moving Average (SMA) of your closing prices (a 60-day SMA is typical, i.e. P=60).
• SMA[t] = mean(Price[t-P:t])
2. Count how many of the last P/2 closes are above the SMA.
• PositivePeriods[t] = count(Price[t-P/2:t] > SMA[t-P/2:t])
3. Divide the number of positive periods by P/2 and multiply by 100 to scale from 0-100.
• TII[t] = PositivePeriods[t] / (P/2) * 100

## TL;DR

We provide backtests and code for four distinct trading strategies using the TII. These tests are designed to give you a feel for the indicator and running backtests. If you want something faster and more complete, check out our free, no-code platform here.

## Trend Intensity Indicator

Let’s get to coding this in Python:

import pandas as pd

def calcTrendIntensityIndex(data, P=60):
    data[f'SMA_{P}'] = data['Close'].rolling(P).mean()
    diff = data['Close'] - data[f'SMA_{P}']
    pos_count = diff.map(
        lambda x: 1 if x > 0 else 0).rolling(int(P/2)).sum()
    data['TII'] = 200 * pos_count / P
    return data


The function is quick and easy.
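As a quick sanity check on the formula (self-contained here, re-implementing the same calculation on synthetic series), a steadily rising price should pin the TII at 100 and a steadily falling one at 0:

```python
import numpy as np
import pandas as pd

def calc_tii(close: pd.Series, P: int = 60) -> pd.Series:
    # Same formula as above: share of the last P/2 closes above the P-SMA
    sma = close.rolling(P).mean()
    pos = (close > sma).astype(int).rolling(P // 2).sum()
    return 200 * pos / P

# A rising price is always above its SMA, so the TII pins at 100;
# a falling price gives 0
up = pd.Series(np.linspace(100, 200, 120))
down = pd.Series(np.linspace(200, 100, 120))
tii_up = calc_tii(up, P=60)
tii_down = calc_tii(down, P=60)
```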

Let’s grab some data and look a bit closer before trading it.

import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf

ticker = 'GDX'
start = '2011-01-01'
end = '2013-01-01'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start=start, end=end)
# Drop unused columns
data.drop(
    ['Open', 'High', 'Low', 'Volume', 'Dividends', 'Stock Splits'],
    axis=1, inplace=True)

P = [30, 60, 90]

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
ax[0].plot(data['Close'], label='Close')

for i, p in enumerate(P):
    df = calcTrendIntensityIndex(data.copy(), p)
    ax[0].plot(df[f'SMA_{p}'], label=f'SMA({p})')
    ax[1].plot(df['TII'], label=f'TII({p})', c=colors[i+1])

ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and SMA Values for {ticker}')
ax[0].legend()
ax[1].set_xlabel('Date')
ax[1].set_ylabel('TII')
ax[1].set_title('Trend Intensity Index')
ax[1].legend()
plt.tight_layout()
plt.show()


Just like the RSI and other oscillators, the TII ranges from 0-100. It does so in a steadier fashion than many of these other indicators because it’s simply counting the number of days where the price is greater than the SMA over the past P/2 periods (30 days in this case). It only moves in discrete steps up and down over time.

Even though the TII is derived from the SMA crossover, it differs significantly from the standard indicator, as shown below.

P = 60
df = calcTrendIntensityIndex(data.copy(), P)
vals = np.where(df[f'SMA_{P}'] > df['Close'], 1, 0)
df['SMA_cross'] = np.hstack([np.nan, vals[:-1] - vals[1:]])

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
ax[0].plot(df['Close'], label='Close')
ax[0].plot(df[f'SMA_{P}'], label=f'SMA({P})')
ax[0].scatter(df.loc[df['SMA_cross']==1].index,
    df.loc[df['SMA_cross']==1][f'SMA_{P}'],
    marker='^', c=colors[3],
    label='Cross Above', zorder=100, s=100)
ax[0].scatter(df.loc[df['SMA_cross']==-1].index,
    df.loc[df['SMA_cross']==-1][f'SMA_{P}'],
    marker='v', c=colors[4],
    label='Cross Below', zorder=100, s=100)
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and {P}-Day SMA for {ticker}')
ax[0].legend()

ax[1].plot(df['TII'], label='TII')
ax[1].scatter(df.loc[df['SMA_cross']==1].index,
df.loc[df['SMA_cross']==1]['TII'],
marker='^', c=colors[3],
label='Cross Above', zorder=100, s=100)
ax[1].scatter(df.loc[df['SMA_cross']==-1].index,
df.loc[df['SMA_cross']==-1]['TII'],
marker='v', c=colors[4],
label='Cross Below', zorder=100, s=100)
ax[1].set_xlabel('Date')
ax[1].set_ylabel('TII')
ax[1].set_title('Trend Intensity Index')
ax[1].legend()
plt.tight_layout()
plt.show()


In this plot, triangles mark where the price crosses above the 60-day SMA and where it drops below it. As can be seen in the lower plot, these crossovers occur all over the TII, so you can’t necessarily tell when a crossover occurs by looking at the TII alone. While the TII is a straightforward derivative of the SMA crossover, you’re going to get very different signals when trading it.

Let’s turn to a few examples of how we can use it.

## Buy Low and Sell High

The first model we’ll run approaches the TII in a similar fashion as the RSI; we buy when the TII crosses below 20 and sell/short when it breaks above 80. We hold the positions until they reach the exit value.

The basic idea is that the price should be split roughly between days above and below the SMA. When we have a series that has deviated from the long-run average, we put a position on and wait for it to revert back to the mean.

def TIIMeanReversion(data, P=60, enter_long=20, exit_long=50,
                     enter_short=80, exit_short=50, shorts=True):
    df = calcTrendIntensityIndex(data, P=P)
    df['position'] = np.where(df['TII']<=enter_long, 1, np.nan)
    # Exit longs when the TII crosses back through the exit level
    _exit_long = df['TII'] - exit_long
    cross_long = _exit_long.shift(1) / _exit_long
    df['position'] = np.where(cross_long<0, 0, df['position'])

    if shorts:
        df['position'] = np.where(df['TII']>=enter_short, -1,
                                  df['position'])
        # Exit shorts on a cross of the short exit level
        _exit_short = df['TII'] - exit_short
        cross_short = _exit_short.shift(1) / _exit_short
        df['position'] = np.where(cross_short<0, 0, df['position'])

    df['position'] = df['position'].ffill().fillna(0)
    return calcReturns(df)

# A few helper functions
def calcReturns(df):
    # Helper function to avoid repeating too much code
    df['returns'] = df['Close'] / df['Close'].shift(1)
    df['log_returns'] = np.log(df['returns'])
    df['strat_returns'] = df['position'].shift(1) * df['returns']
    df['strat_log_returns'] = df['position'].shift(1) * df['log_returns']
    df['cum_returns'] = np.exp(df['log_returns'].cumsum()) - 1
    df['strat_cum_returns'] = np.exp(df['strat_log_returns'].cumsum()) - 1
    df['peak'] = df['cum_returns'].cummax()
    df['strat_peak'] = df['strat_cum_returns'].cummax()
    return df

def getStratStats(log_returns: pd.Series,
                  risk_free_rate: float = 0.02):
    stats = {}
    # Total Returns
    stats['tot_returns'] = np.exp(log_returns.sum()) - 1

    # Mean Annual Returns
    stats['annual_returns'] = np.exp(log_returns.mean() * 252) - 1

    # Annual Volatility
    stats['annual_volatility'] = log_returns.std() * np.sqrt(252)

    # Sortino Ratio
    annualized_downside = log_returns.loc[log_returns<0].std() * \
        np.sqrt(252)
    stats['sortino_ratio'] = (stats['annual_returns'] -
        risk_free_rate) / annualized_downside

    # Sharpe Ratio
    stats['sharpe_ratio'] = (stats['annual_returns'] -
        risk_free_rate) / stats['annual_volatility']

    # Max Drawdown
    cum_returns = log_returns.cumsum() - 1
    peak = cum_returns.cummax()
    drawdown = peak - cum_returns
    max_idx = drawdown.argmax()
    stats['max_drawdown'] = 1 - np.exp(cum_returns.iloc[max_idx]) \
        / np.exp(peak.iloc[max_idx])

    # Max Drawdown Duration
    strat_dd = drawdown[drawdown==0]
    strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
    strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
    strat_dd_days = np.hstack([strat_dd_days,
        (drawdown.index[-1] - strat_dd.index[-1]).days])
    stats['max_drawdown_duration'] = strat_dd_days.max()
    return {k: np.round(v, 4) if isinstance(v, np.floating) else v
            for k, v in stats.items()}


I also added a few helper functions to get our returns and statistics for easy plotting and comparison.

Let’s get some more data for our ticker and see if the strategy can beat buying and holding our gold mining ETF.

ticker = 'GDX'
start = '2000-01-01'
end = '2020-12-31'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start=start, end=end)
# Drop unused columns
data.drop(
    ['Open', 'High', 'Low', 'Volume', 'Dividends', 'Stock Splits'],
    axis=1, inplace=True)

P = 60
enter_long = 20
enter_short = 80
df_mr = TIIMeanReversion(data.copy(), P=P, enter_long=enter_long,
                         enter_short=enter_short)

fig, ax = plt.subplots(3, figsize=(12, 8), sharex=True)
ax[0].plot(df_mr['Close'], label='Close')
ax[0].plot(df_mr[f'SMA_{P}'], label=f'SMA({P})')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and {P}-Day SMA for {ticker}')
ax[0].legend()

ax[1].plot(df_mr['TII'], label='TII')
ax[1].axhline(enter_short, c=colors[1], label='Short Signal Line',
    linestyle=':')
ax[1].axhline(enter_long, c=colors[2], label='Long Signal Line',
    linestyle=':')
ax[1].axhline(50, c=colors[3], label='Exit Line', linestyle=':')
ax[1].set_ylabel('TII')
ax[1].set_title('Trend Intensity Index')
ax[1].legend(bbox_to_anchor=[1, 0.65])

ax[2].plot(df_mr['strat_cum_returns']*100, label='Mean Reversion')
ax[2].plot(df_mr['cum_returns']*100, label='Buy and Hold')
ax[2].set_title('Cumulative Returns')
ax[2].set_ylabel('Returns (%)')
ax[2].set_xlabel('Date')
ax[2].legend()
plt.tight_layout()
plt.show()

df_stats = pd.DataFrame(getStratStats(df_mr['log_returns']),
    index=['Buy and Hold'])
df_stats = pd.concat([df_stats, pd.DataFrame(getStratStats(
    df_mr['strat_log_returns']), index=['Mean Reversion'])])
df_stats


The mean reversion model yields some reasonable returns above and beyond the buy and hold approach. The risk-adjusted metrics aren’t great, but it is a good boost over the underlying.

## Momentum Trading

Like the RSI and other oscillators, we can interpret the TII as a momentum indicator, buying when it breaks above the centerline and selling/shorting when it crosses below it. For this, we’ll add a bit of a wrinkle: we’ll hold while the TII is above an upper level, then sell if it falls back below that level. We treat the short side the same way. This should get us out of positions sooner, hopefully with a larger profit than if we held until the TII reversed all the way back to the centerline.
def TIIMomentum(data, P=60, centerline=50, upper=80, lower=20,
                shorts=True):
    df = calcTrendIntensityIndex(data, P=P)
    position = np.zeros(df.shape[0])
    for i, (idx, row) in enumerate(df.iterrows()):
        if np.isnan(row['TII']):
            last_row = row.copy()
            continue
        if row['TII'] >= centerline and last_row['TII'] < centerline:
            # Go long if no position
            if position[i-1] != 1:
                position[i] = 1
        elif row['TII'] > centerline and position[i-1] == 1:
            # Check if broke below upper line
            if last_row['TII'] > upper and row['TII'] <= upper:
                position[i] = 0
            else:
                position[i] = 1
        elif position[i-1] == 1 and row['TII'] <= centerline:
            # Sell/short if broke below centerline
            if shorts:
                position[i] = -1
            else:
                position[i] = 0
        elif shorts:
            if row['TII'] <= centerline and \
                    last_row['TII'] > centerline:
                # Go short if no position
                if position[i-1] != -1:
                    position[i] = -1
            elif row['TII'] <= centerline and position[i-1] == -1:
                # Check if broke above lower line
                if last_row['TII'] < lower and row['TII'] > lower:
                    position[i] = 0
                else:
                    position[i] = -1
            elif position[i-1] == -1 and row['TII'] > centerline:
                # Exit short and go long if it crosses
                position[i] = 1
        last_row = row.copy()
    df['position'] = position
    return calcReturns(df)


The easiest way to get the logic right is just by looping over the data and buying/selling in each of the different cases. The code is a bit longer, but hopefully easier to understand.

Let’s move on to testing it with our standard settings.

P = 60
df_mom = TIIMomentum(data.copy(), P=P)

fig, ax = plt.subplots(3, figsize=(12, 8), sharex=True)
ax[0].plot(df_mom['Close'], label='Close')
ax[0].plot(df_mom[f'SMA_{P}'], label=f'SMA({P})')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and {P}-Day SMA for {ticker}')
ax[0].legend()

ax[1].plot(df_mom['TII'], label='TII')
ax[1].axhline(20, c=colors[1],
              label='Exit Short Signal Line', linestyle=':')
ax[1].axhline(80, c=colors[2],
              label='Exit Long Signal Line', linestyle=':')
ax[1].axhline(50, c=colors[3],
              label='Center Line', linestyle=':')
ax[1].set_ylabel('TII')
ax[1].set_title('Trend Intensity Index')
ax[1].legend(bbox_to_anchor=[1, 0.65])

ax[2].plot(df_mom['strat_cum_returns']*100, label='Momentum')
ax[2].set_title('Cumulative Returns')
ax[2].set_ylabel('Returns (%)')
ax[2].set_xlabel('Date')
ax[2].legend()
plt.tight_layout()
plt.show()

df_stats = pd.concat([df_stats,
                      pd.DataFrame(getStratStats(
                          df_mom['strat_log_returns']),
                          index=['Momentum'])])
df_stats


This strategy lost over 60% of its initial capital – clearly not what we were hoping for. The underlying is fairly volatile, so it may perform better if we reduce P to try to take advantage of some of these shorter moves. I leave that as an exercise to the reader.

Like the MACD, the TII is often traded with a signal line. This signal line is just the EMA of the TII (i.e. EMA(TII(P))).
To calculate this, we need to grab a couple of functions for the EMA calculation.

def _calcEMA(P, last_ema, N):
    return (P - last_ema) * (2 / (N + 1)) + last_ema

def calcEMA(series, N):
    # Seed the EMA with the N-period SMA, then apply the
    # recursive update on the raw series values
    sma = series.rolling(N).mean()
    ema = np.zeros(len(series)) + np.nan
    for i, (val, mean) in enumerate(zip(series.values, sma.values)):
        if np.isnan(ema[i-1]):
            ema[i] = mean
        else:
            ema[i] = _calcEMA(val, ema[i-1], N)
    return ema


We’ll use these functions to calculate the signal line. We’ll buy when the TII crosses above the signal line, and sell/short when the signal line crosses below.

We’ll code this up below:

def TIISignal(data, P=60, N=9, shorts=True):
    df = calcTrendIntensityIndex(data, P=P)
    df['SignalLine'] = calcEMA(df['TII'], N=N)
    df['position'] = np.nan
    df['position'] = np.where(df['TII']>=df['SignalLine'], 1,
        df['position'])
    if shorts:
        df['position'] = np.where(df['TII']<df['SignalLine'], -1,
            df['position'])
    else:
        df['position'] = np.where(df['TII']<df['SignalLine'], 0,
            df['position'])

    df['position'] = df['position'].ffill().fillna(0)
    return calcReturns(df)


This strategy is short and sweet. Just buy on those signal line cross-overs.

P = 60
N = 9
df_sig = TIISignal(data.copy(), P=P, N=N)

fig, ax = plt.subplots(3, figsize=(12, 8), sharex=True)
ax[0].plot(df_sig['Close'], label='Close')
ax[0].plot(df_sig[f'SMA_{P}'], label=f'SMA({P})')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and {P}-Day SMA for {ticker}')
ax[0].legend()

ax[1].plot(df_sig['TII'], label='TII')
ax[1].plot(df_sig['SignalLine'], label='Signal Line')
ax[1].set_ylabel('TII')
ax[1].set_title('Trend Intensity Index and Signal Line')
ax[1].legend(bbox_to_anchor=[1, 0.65])

ax[2].plot(df_sig['strat_cum_returns']*100, label='Signal Line')
ax[2].plot(df_sig['cum_returns']*100, label='Buy and Hold')
ax[2].set_title('Cumulative Returns')
ax[2].set_ylabel('Returns (%)')
ax[2].set_xlabel('Date')
ax[2].legend()
plt.tight_layout()
plt.show()

df_stats = pd.concat([df_stats,
    pd.DataFrame(getStratStats(
        df_sig['strat_log_returns']),
        index=['Signal'])])
df_stats


This model missed out on the big gold bull market from 2008-2011, enduring a long and protracted drawdown. It did have some great returns as the gold price fell into 2016. After running flat for a few years, closely following the GDX, it started to pick up with the underlying and managed to get on the right side of the COVID crash – but then gave it all back as the GDX rebounded quickly. Lots of volatility here and not much upside leads to some poor risk-adjusted metrics. You can always adjust the standard P and N values to see if you can find some better parameters for this one.

## Trading with Signal Momentum

I decided to throw in a bonus model now that we have the signal line. We can use it as a classic momentum indicator, buying when the TII signal crosses the centerline. In this case, I didn't bother with trying to hit the tops/bottoms when momentum starts to wane like we did above; we just buy and sell based on the centerline, which drastically simplifies the code.
def TIISignalMomentum(data, P=60, N=9, centerline=50, shorts=True):
    df = calcTrendIntensityIndex(data, P=P)
    df['SignalLine'] = calcEMA(df['TII'], N=N)
    df['position'] = np.nan
    df['position'] = np.where(df['SignalLine']>centerline, 1,
        df['position'])
    if shorts:
        df['position'] = np.where(df['SignalLine']<centerline, -1,
            df['position'])
    else:
        df['position'] = np.where(df['SignalLine']<centerline, 0,
            df['position'])
    df['position'] = df['position'].ffill().fillna(0)
    return calcReturns(df)


And to test it on our data:

P = 60
N = 9
df_sigM = TIISignalMomentum(data.copy(), P=P, N=N)

fig, ax = plt.subplots(3, figsize=(12, 8), sharex=True)
ax[0].plot(df_sigM['Close'], label='Close')
ax[0].plot(df_sigM[f'SMA_{P}'], label=f'SMA({P})')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and {P}-Day SMA for {ticker}')
ax[0].legend()

ax[1].plot(df_sigM['TII'], label='TII')
ax[1].plot(df_sigM['SignalLine'], label='Signal Line')
ax[1].axhline(50, label='Centerline', c='k', linestyle=':')
ax[1].set_ylabel('TII')
ax[1].set_title('Trend Intensity Index and Signal Line')
ax[1].legend(bbox_to_anchor=[1, 0.65])

ax[2].plot(df_sigM['strat_cum_returns']*100, label='Signal Line')
ax[2].set_title('Cumulative Returns')
ax[2].set_ylabel('Returns (%)')
ax[2].set_xlabel('Date')
ax[2].legend()
plt.tight_layout()
plt.show()

df_stats = pd.concat([df_stats,
pd.DataFrame(getStratStats(
df_sigM['strat_log_returns']),
index=['Signal Momentum'])])
df_stats


This wound up being the worst performer of them all! This strategy lost over 70% of its initial capital and spent 17 years in a drawdown. I don't know anyone with the patience to stick with a strategy through those kinds of losses.

Of course, you’ve got the standard knobs to play with to try to pull some better returns out of this one. Plus you could try adding the complexity from the previous momentum strategy to see how that could boost returns.

The TII is designed to measure the strength of a trend. We walked through four different trading strategies, but none really shone. That could be due to a variety of factors – poor selection of the underlying security, bad parameterization, the need to combine it with other indicators, or maybe the indicator is just terrible.

The only way to know for sure is by testing it.

Proper testing and trading is difficult. The code here should not be relied upon for running your account – it makes too many simplifying assumptions for a real system. It’s just here to teach you about various indicators, running a backtest, and give you a feel for some of the key metrics used to evaluate a strategy.

If you want a fuller backtest experience, check us out where we provide professional backtests and data in a no-code framework so you can test ideas and find valuable trading ideas quickly.

In 1982, a group of inexperienced traders were recruited to be a part of an experiment that would make many of them multi-millionaires.

Richard Dennis bet his partner William Eckhardt that anyone could be a successful trader given they had training and a system to follow. It was a re-hash of the nature vs nurture debate, but now with millions of dollars on the line.

Dennis and Eckhardt trained their Turtles and furnished each with $1,000,000 to trade. Over the next 4 years, the traders returned over $175 million following Dennis and Eckhardt's system.

Below, we break down the rules of this system and implement it in Python.

## The Complete Turtle Trader Rules

There are some different versions of the Turtle rules online, so I’m going to choose one set and stick with it. I’m going with the rules laid out in Chapter 5 of Michael Covel’s book The Complete Turtle Trader, because they are some of the most detailed and complete. I’ll explain where I may deviate from Covel’s rules or feel the need to provide some additional details and examples – because there are some rules which are prone to cause confusion!

The Turtles are trend followers, meaning they’re looking for price breakouts (closing highs or lows over a given lookback period) to buy an instrument (in the case of highs) or short (in the case of lows) it. They actually traded two separate but closely related systems called System 1 (S1) and System 2 (S2).

S2 is slower than S1. S2 watches for a 55-day breakout for an entry and a 20-day breakout in the opposite direction to sell (which I’ll refer to as a breakdown). For example, if the system goes long after a breakout (i.e., when the closing price of the stock exceeds the previous 55-day high), then it will close the position when the price hits the lowest close over the past 20 trading days.

S1 looks for a 20-day breakout for an entry and 10-day breakdown for an exit. The other slight wrinkle for S1 is a filter that causes it to trade every other successful breakout. This means if some signal turns into a profitable trade, then the next time that signal comes up, it skips it.

Turtles were given the freedom to allocate whatever percentage of their capital to either system.

All of these are variables in the code below so you can play with them as you see fit.

## Position Sizing

The Turtles used a volatility-based position sizing method to normalize their risk across contracts and instruments. The position size was calculated using the 20-day simple moving average (SMA-20) of the True Range (TR) of the price. TR is just the largest absolute value of the difference between the high, low, and close:

TR = \textrm{max}\big(\mid High - Low \mid, \mid High - Close \mid, \mid Close - Low \mid \big)

The Average True Range (ATR) is just the moving average of this value, which we calculate for the past 20-days.
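The TR and ATR definitions above can be vectorized in pandas. This is a sketch: `calc_atr` is our own helper name, and it follows the article's close-based TR (the classic ATR uses the prior day's close instead):

```python
import pandas as pd

def calc_atr(df, periods=20):
    # True Range per the definition above: the largest absolute spread
    # among high-low, high-close, and close-low for each bar
    tr = pd.concat([(df['High'] - df['Low']).abs(),
                    (df['High'] - df['Close']).abs(),
                    (df['Close'] - df['Low']).abs()], axis=1).max(axis=1)
    # ATR = simple moving average of TR
    return tr.rolling(periods).mean()

# Tiny illustrative frame: TR is 1.0 on the first bar and 2.0 on the second
ohlc = pd.DataFrame({'High': [2.0, 3.0], 'Low': [1.0, 1.0],
                     'Close': [1.5, 2.0]})
atr = calc_atr(ohlc, periods=2)
```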

In Turtle jargon, this 20-day ATR is called N, which is used as a proxy for risk and to calculate position size. Higher ATR means a higher volatility and a higher value of N and increased risk. Unfortunately, this is where things often get confusing.

Covel writes that the Turtles would bet 2% of their equity on each trade, which was, again in Turtle jargon, called a unit. If you have $10,000, then you have 50 units to trade, each of $200. To determine how much you actually invest, you weight the dollar volatility of the instrument by N and a constant value to adjust your risk (e.g. 2).

What's dollar volatility? It's the impact a $1 change in the quoted price has on your portfolio. For example, if you're trading the Bitcoin futures contract on the CME, the contract size is 5 BTC, so a $1 change in the quoted price yields a $5 change to your portfolio, meaning your dollar volatility is 5. Likewise, a gold contract is quoted in price per troy ounce, but the contract consists of 100 troy ounces, so its dollar volatility is 100.

Turtles use these values to get their position size in the number of contracts as follows:

U = \textrm{floor}\bigg( \frac{fC}{r_t N D} \bigg)

where U is the size of 1 unit of risk, C is the total capital you have to trade, f is the fraction of capital you use to scale your unit, r_t is a risk coefficient, and D is your dollar volatility. The floor function just rounds down to the nearest whole number.

Unit has a habit of causing some confusion, so let's address that quickly. The Turtles used the concept of a unit to scale different instruments by the risk they wanted, as measured by N. So if you are trading Bitcoin (which is very volatile) and German 30-year Bunds (which are much less volatile by comparison), but want to have equal risk in each, you assign 1 unit to each, and then buy the number of contracts or shares required to achieve that unit.

To show a quick example, say you have $100,000 of capital and get a signal on that Bitcoin contract. You're devoting 2% to the trade, your risk adjustment is 2, and N over this period is 120 (which would be crazy low for Bitcoin, but this is just an example). Then your system would dictate that you should buy 1 contract:

U = \textrm{floor}\bigg(\frac{0.02 \times 100,000}{2 \times 120 \times 5}\bigg) = \textrm{floor}(1.667) = 1\textrm{ contract}

This means that a single unit of risk can be purchased for a single Bitcoin contract. The higher the N, the fewer contracts you need to purchase to get the same risk exposure.

To apply this to trading stocks, we use the same formula but replace D with our share price P.

U = \textrm{floor}\bigg( \frac{fC}{r_t N P} \bigg)

Imagine we're trading Apple at P = $150/share with N = 2.5; then we'd buy:

U = \textrm{floor}\bigg(\frac{0.02 \times 100,000}{2 \times 2.5 \times 150}\bigg) = \textrm{floor}(2.667) = 2 \textrm{ shares}

You read that right: just two shares of Apple for a $100,000 account.
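The two worked examples above can be reproduced with a short helper. This is a sketch; `unit_size` is our own name for the formula, not Turtle jargon:

```python
import numpy as np

def unit_size(f, C, r, N, D):
    """U = floor(f*C / (r*N*D)), per the formula above.
    f: fraction of capital per unit, C: total capital, r: risk coefficient,
    N: 20-day ATR, D: dollar volatility (or share price for stocks)."""
    return int(np.floor(f * C / (r * N * D)))

# Bitcoin futures example: $100,000 account, 2% units, r = 2, N = 120, D = 5
btc_contracts = unit_size(0.02, 100_000, 2, 120, 5)
# Apple example: replace D with the share price P = $150, N = 2.5
aapl_shares = unit_size(0.02, 100_000, 2, 2.5, 150)
```

Both calls round down, which is why a $100,000 account ends up with a single Bitcoin contract and just two Apple shares.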

This system is designed to be diversified – you never know where the trend is going to come from – so you hold small positions in lots of different instruments, then you pyramid into winners over time (discussed below). The idea of all of this volatility unit weighting is to adjust your position such that one unit of instrument A is equivalent in terms of risk to one unit of instrument B. That way a diversified portfolio of offsetting risks can be easily managed.

One last thing about position sizing. The Turtles had rules to reduce their position size as they lose capital. For every 10% they lose, they reduce their unit size by 20%, then they add that back as they recover their capital and trade full units again.
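The capital-reduction rule can be sketched as a small helper. The function and parameter names here are our own, and we assume the reductions stack in full 10% steps:

```python
import numpy as np

def scaled_unit(base_unit, equity, initial_equity,
                reduction_level=0.10, reduction_rate=0.20):
    """Cut the unit size by 20% for every full 10% of starting capital
    lost, restoring full size as equity recovers."""
    loss = 1 - equity / initial_equity
    if loss <= 0:
        return float(base_unit)          # at or above initial capital: full size
    steps = np.floor(loss / reduction_level)  # completed 10% loss increments
    return float(base_unit * max(0.0, 1 - steps * reduction_rate))

# Down 25%: two full 10% steps, so trade at 60% of normal size
unit = scaled_unit(2000, 75_000, 100_000)
```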

## Pyramiding

Trends can take off and run for quite some time, so Turtles use the concept of pyramiding – adding to winning positions – to grow their exposure and winnings with the trend.

Again, we’re going to rely on N to make our moves here. If the price moves up 1N above the original price, then we add 1 unit to our position. Additionally, we’ll use a trailing stop at this point and move all of our stops to the current price minus 2N. We can continue to pyramid on top of a trend until we have 5 units, then we let it run until we exit.
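The add-on and stop levels above can be laid out for the long side with a quick sketch (the function name is ours; shorts mirror this with the signs flipped):

```python
def pyramid_levels(entry_price, N, risk_level=2, max_units=5):
    """Long-side add-on prices (every +1N above the last entry) and the
    trailing stop after each add (price - 2N with the default risk_level)."""
    adds, stops = [], []
    price = entry_price
    for _ in range(max_units - 1):   # units 2..max_units are add-ons
        price += N                   # next add triggers 1N above the last
        adds.append(price)
        stops.append(price - risk_level * N)  # all stops move up together
    return adds, stops

# Entry at $100 with N = 2: adds at 102, 104, 106, 108
adds, stops = pyramid_levels(100.0, 2.0)
```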

## Exiting

Our exits were touched on briefly, but to make them explicit, Turtles exit a trade when the price hits their stop or they get a breakout in the opposite direction of the trade they opened.

## Markets

When they started out, the Turtles had almost two-dozen instruments to trade including US bonds of various durations, cotton, sugar, gold, coffee, crude oil, heating oil, gasoline, S&P 500 futures, silver and a handful of currencies like the Swiss Franc, French Franc, Deutschmark, British Pound, Eurodollar, and Japanese Yen. Some of these don’t exist anymore, namely the French Franc and Deutschmark which have been replaced by the Euro, so a modern basket would be a bit different. The bigger issue with this set up is the amount of capital that would be required to trade many of these.

Futures do allow significant margins, however prices may still be out of the range for a typical retail investor. Take gold for example. Currently a 100 oz contract would cost roughly $180,000 without margin. If you could leverage that 20:1, you'd still be required to put up $9,000. If we're using the conservative position sizing of the Turtles, it would probably take a few years of great returns or a huge crash in the gold price to be able to squeeze a single contract into your system.

There are other complexities with commodities such as rolling contracts to avoid expiration, getting the data, margin calculations, and so forth. To try to keep the code simple, we’ll just trade a random group of equities for this example. This is certainly NOT recommended because of correlation issues, but will allow us to demonstrate the principles of the system.

## Position Limits and Correlations

If you haven't noticed by now, the most important thing in this system is controlling your risk. Risk is primarily controlled via position sizing, and it's what enables trend followers like the Turtles to rack up great returns: when they lose (which they do often), they don't lose very much.

In that vein, we have position limits that the system respects. Any entry is going to be limited to just a few units (4 or 5) and pyramiding is capped at 5. Moreover, correlated positions impose limits on trades.

Because some of the contracts the Turtles traded were highly correlated (e.g. silver and gold, different durations of bonds, types of oil, etc.) they were considered to be more or less the same market. One long unit of 10-year US Treasuries is highly correlated with another long unit of 30-year US Treasuries, so the Turtles would consider this to be net 2 long units – roughly twice the portfolio risk of being long one and short the other, a combination that would be more or less neutral because of the correlation.

To do Turtle trading properly, we’d be including commodities and checking for uncorrelated markets. Like I mentioned above, we’re taking a little shortcut here and just grabbing some stocks – which will likely be very correlated because they’re all coming from the S&P 500. It would be interesting to see, however, if an equity portfolio could be constructed with stocks and ETFs that mirror commodity markets to mimic a Turtle (or general CTA) strategy with equities and potentially less capital and simpler rules.

While the system doesn't rely on any complex mathematics, it has a few moving parts, so it may be helpful to look at it step-by-step before getting into the details of the code.

For trading or backtesting, we’re going to be running a big loop where the system gets new daily tick data, and everything executes from there. We can describe it as follows:

1. Get new data.

2. Calculate N.

3. For each system (S1 and S2) get breakouts/breakdowns (if any).

4. For each system-instrument combination do the following:

• If a breakout occurs and no current position is open, size the position according to the rules and enter a position (long or short depending on breakout direction). For S1 only: ignore the breakout if the last breakout led to a winning trade.
• If a position is open and the price has moved 1N above the entry point, add one unit – sized according to the rules – to the position.
• If a position is open and a breakdown occurs, liquidate the position.
• If a position is open and the price hits the stop loss, liquidate the position.
• Otherwise, do nothing.

At a high level, the procedure is pretty simple. The complexity comes in the execution of the model and handling all of the little details.

So let’s get to coding this system!

To start, import a few packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from copy import deepcopy, copy


We’re going to need a few helper functions to make calculations and then to analyze our data afterwards.

Let's start with a function for calculating the True Range (TR).

def calcTR(high, low, close):
    '''Calculate True Range per the formula above'''
    # high-close replaces a duplicated low-close term so the code
    # matches the stated definition
    return np.max(np.abs([high - low, high - close, close - low]))


Next, we’ll write some code to look at our returns so we can get our risk metrics.

def getStratStats(log_returns: pd.Series,
                  risk_free_rate: float = 0.02):
    stats = {}

    # Total Returns
    stats['tot_returns'] = np.exp(log_returns.sum()) - 1

    # Mean Annual Returns
    stats['annual_returns'] = np.exp(log_returns.mean() * 252) - 1

    # Annual Volatility
    stats['annual_volatility'] = log_returns.std() * np.sqrt(252)

    # Sortino Ratio
    annualized_downside = log_returns.loc[log_returns<0].std() * \
        np.sqrt(252)
    stats['sortino_ratio'] = (stats['annual_returns'] - \
        risk_free_rate) / annualized_downside

    # Sharpe Ratio
    stats['sharpe_ratio'] = (stats['annual_returns'] - \
        risk_free_rate) / stats['annual_volatility']

    # Max Drawdown
    cum_returns = log_returns.cumsum() - 1
    peak = cum_returns.cummax()
    drawdown = peak - cum_returns
    max_idx = drawdown.argmax()
    stats['max_drawdown'] = 1 - np.exp(cum_returns[max_idx]) \
        / np.exp(peak[max_idx])

    # Max Drawdown Duration
    strat_dd = drawdown[drawdown==0]
    strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
    strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
    strat_dd_days = np.hstack([strat_dd_days,
        (drawdown.index[-1] - strat_dd.index[-1]).days])
    stats['max_drawdown_duration'] = strat_dd_days.max()
    # isinstance check replaces np.float_, which was removed in NumPy 2.0
    return {k: np.round(v, 4) if isinstance(v, (float, np.floating)) else v
            for k, v in stats.items()}


Finally, we’ll jump into the TurtleSystem itself. I’ll give the model below and some explanation of a few of the key methods. Hopefully the code is relatively self-explanatory.

class TurtleSystem:

    def __init__(self, tickers, init_account_size=10000, risk_level=2, r_max=0.02,
                 sys1_entry=20, sys1_exit=10, sys2_entry=55, sys2_exit=20,
                 atr_periods=20, sys1_allocation=0.5, risk_reduction_rate=0.1,
                 risk_reduction_level=0.2, unit_limit=5, pyramid_units=1,
                 start='2000-01-01', end='2020-12-31', shorts=True):
        '''
        :tickers: list of security symbols to trade.
        :init_account_size: int that sets initial trading capital.
        :risk_level: float used to determine the stop loss distance by
            multiplying this value with N.
        :r_max: float max percentage of account that a trade can risk.
        :sys1_entry: int number of breakout days for System 1 to generate
            a buy signal.
        :sys1_exit: int number of breakout days for System 1 to generate
            a sell signal.
        :sys2_entry: int number of breakout days for System 2 to generate
            a buy signal.
        :sys2_exit: int number of breakout days for System 2 to generate
            a sell signal.
        :sys1_allocation: float to balance capital allocation
            between System 1 and 2.
        :start: str first date for getting data.
        :end: str end date for getting data.
        :shorts: bool to allow short positions if True.
        :atr_periods: int number of days used to calculate SMA of N.
        :risk_reduction_rate: float < 1, fraction by which the unit size is
            reduced for each risk_reduction_level of capital lost.
        :risk_reduction_level: float < 1, increment of loss below the initial
            account size that triggers each unit-size reduction.
        '''
        self.tickers = tickers
        self.init_account_size = init_account_size
        self.cash = init_account_size
        self.portfolio_value = init_account_size
        self.risk_level = risk_level
        self.r_max = r_max
        self.sys1_entry = sys1_entry
        self.sys1_exit = sys1_exit
        self.sys2_entry = sys2_entry
        self.sys2_exit = sys2_exit
        self.sys1_allocation = sys1_allocation
        self.sys2_allocation = 1 - sys1_allocation
        self.start = start
        self.end = end
        self.atr_periods = atr_periods
        self.shorts = shorts
        self.last_s1_win = {t: False for t in self.tickers}
        self.unit_limit = unit_limit
        self.risk_reduction_level = risk_reduction_level
        self.risk_reduction_rate = risk_reduction_rate
        self.pyramid_units = pyramid_units
        self.sys_list = ['S1', 'S2']

        self._prep_data()

    def _prep_data(self):
        self.data = self._get_data()
        self._calc_breakouts()
        self._calc_N()

    def _get_data(self):
        # Gets data for all tickers from yfinance
        yfObj = yf.Tickers(self.tickers)
        df = yfObj.history(start=self.start, end=self.end)
        df.drop(['Open', 'Dividends', 'Stock Splits', 'Volume'],
                inplace=True, axis=1)
        df.ffill(inplace=True)
        return df.swaplevel(axis=1)

    def _calc_breakouts(self):
        # Gets breakouts for all tickers
        for t in self.tickers:
            # Breakouts for enter long position (EL), exit long (ExL),
            # enter short (ES), exit short (ExS)
            self.data[t, 'S1_EL'] = self.data[t]['Close'].rolling(
                self.sys1_entry).max()
            self.data[t, 'S1_ExL'] = self.data[t]['Close'].rolling(
                self.sys1_exit).min()
            self.data[t, 'S2_EL'] = self.data[t]['Close'].rolling(
                self.sys2_entry).max()
            self.data[t, 'S2_ExL'] = self.data[t]['Close'].rolling(
                self.sys2_exit).min()

            if self.shorts:
                self.data[t, 'S1_ES'] = self.data[t]['Close'].rolling(
                    self.sys1_entry).min()
                self.data[t, 'S1_ExS'] = self.data[t]['Close'].rolling(
                    self.sys1_exit).max()
                self.data[t, 'S2_ES'] = self.data[t]['Close'].rolling(
                    self.sys2_entry).min()
                self.data[t, 'S2_ExS'] = self.data[t]['Close'].rolling(
                    self.sys2_exit).max()

    def _calc_N(self):
        # Calculates N (the 20-day ATR) for all tickers
        for t in self.tickers:
            tr = self.data[t].apply(
                lambda x: calcTR(x['High'], x['Low'], x['Close']), axis=1)
            self.data[t, 'N'] = tr.rolling(self.atr_periods).mean()

    def _check_cash_balance(self, shares, price):
        # Checks to see if we have enough cash to make the purchase.
        # If not, resizes the position to the largest feasible level
        if self.cash <= shares * price:
            shares = np.floor(self.cash / price)
        return shares

    def _adjust_risk_units(self, units):
        # Scales units down as capital is lost below the initial
        # account size (see the risk_reduction_* parameters)
        cap_loss = 1 - self.portfolio_value / self.init_account_size
        if cap_loss > self.risk_reduction_level:
            scale = np.floor(cap_loss / self.risk_reduction_level)
            units *= (1 - scale * self.risk_reduction_rate)
        return units

    def _calc_portfolio_value(self, portfolio):
        pv = sum([v1['value'] for v0 in portfolio.values() if type(v0) is dict
                  for k1, v1 in v0.items() if v1 is not None])
        pv += self.cash
        if np.isnan(pv):
            raise ValueError(f"PV = {pv}\n{portfolio}")
        return pv

    def _get_units(self, system):
        sys_all = self.sys1_allocation if system == 1 else self.sys2_allocation
        dollar_units = self.r_max * self.portfolio_value * sys_all
        dollar_units = self._adjust_risk_units(dollar_units)
        return dollar_units

    def _size_position(self, data, dollar_units):
        shares = np.floor(dollar_units / (
            self.risk_level * data['N'] * data['Close']))
        return shares

    def _run_system(self, ticker, data, position, system=1):
        S = system # System number
        price = data['Close']
        if np.isnan(price):
            # Return current position in case of missing data
            return position
        N = data['N']
        dollar_units = self._get_units(S)
        shares = 0
        if position is None:
            if price == data[f'S{S}_EL']: # Buy on breakout
                if S == 1 and self.last_s1_win[ticker]:
                    self.last_s1_win[ticker] = False
                    return None
                shares = self._size_position(data, dollar_units)
                stop_price = price - self.risk_level * N
                long = True
            elif self.shorts:
                if price == data[f'S{S}_ES']: # Sell short
                    if S == 1 and self.last_s1_win[ticker]:
                        self.last_s1_win[ticker] = False
                        return None
                    shares = self._size_position(data, dollar_units)
                    stop_price = price + self.risk_level * N
                    long = False
                else:
                    return None
            if shares == 0:
                return None
            # Ensure we have enough cash to trade
            shares = self._check_cash_balance(shares, price)
            value = price * shares

            self.cash -= value
            position = {'units': 1,
                        'shares': shares,
                        'entry_price': price,
                        'stop_price': stop_price,
                        'entry_N': N,
                        'value': value,
                        'long': long}
            if np.isnan(self.cash) or self.cash < 0:
                raise ValueError(f"Cash Error\n{S}-{ticker}\n{data}\n{position}")

        else:
            if position['long']:
                # Check to exit existing long position
                if price == data[f'S{S}_ExL'] or price <= position['stop_price']:
                    self.cash += position['shares'] * price
                    if S == 1:
                        if price >= position['entry_price']:
                            self.last_s1_win[ticker] = True
                        else:
                            self.last_s1_win[ticker] = False
                    position = None
                # Check to pyramid existing position
                elif position['units'] < self.unit_limit:
                    if price >= position['entry_price'] + position['entry_N']:
                        shares = self._size_position(data, dollar_units)
                        shares = self._check_cash_balance(shares, price)
                        self.cash -= shares * price
                        stop_price = price - self.risk_level * N
                        avg_price = (position['entry_price'] * position['shares'] +
                                     shares * price) / (position['shares'] + shares)
                        position['entry_price'] = avg_price
                        position['shares'] += shares
                        position['stop_price'] = stop_price
                        position['units'] += 1
            else:
                # Check to exit existing short position
                if price == data[f'S{S}_ExS'] or price >= position['stop_price']:
                    self.cash += position['shares'] * price
                    if S == 1:
                        if price <= position['entry_price']:
                            self.last_s1_win[ticker] = True
                        else:
                            self.last_s1_win[ticker] = False
                    position = None
                # Check to pyramid existing position
                elif position['units'] < self.unit_limit:
                    if price <= position['entry_price'] - position['entry_N']:
                        shares = self._size_position(data, dollar_units)
                        shares = self._check_cash_balance(shares, price)
                        self.cash -= shares * price
                        stop_price = price + self.risk_level * N
                        avg_price = (position['entry_price'] * position['shares'] +
                                     shares * price) / (position['shares'] + shares)
                        position['entry_price'] = avg_price
                        position['shares'] += shares
                        position['stop_price'] = stop_price
                        position['units'] += 1

        if position is not None:
            # Update value at each time step
            position['value'] = position['shares'] * price

        return position

    def run(self):
        # Runs backtest on the turtle strategy
        self.portfolio = {}
        position = {s:
            {t: None for t in self.tickers}
            for s in self.sys_list}

        for i, (ts, row) in enumerate(self.data.iterrows()):
            for t in self.tickers:
                for s, system in enumerate(self.sys_list):
                    # Pass system=s+1 so S2 uses its own breakout columns
                    position[system][t] = self._run_system(
                        t, row[t], position[system][t], system=s+1)
            self.portfolio[i] = deepcopy(position)
            self.portfolio[i]['date'] = ts
            self.portfolio[i]['cash'] = copy(self.cash)
            self.portfolio_value = self._calc_portfolio_value(self.portfolio[i])

    def get_portfolio_values(self):
        vals = []
        for v in self.portfolio.values():
            pv = sum([v1['value'] for v0 in v.values() if type(v0) is dict
                      for k1, v1 in v0.items() if v1 is not None])
            pv += v['cash']
            vals.append(pv)
        return pd.Series(vals, index=self.data.index)

    def get_system_data_dict(self):
        sys_dict = {}
        cols = ['units', 'shares', 'entry_price', 'stop_price',
                'entry_N', 'value', 'long']
        X = np.empty(shape=(len(cols)))
        X[:] = np.nan
        index = [v['date'] for v in self.portfolio.values()]
        for s in self.sys_list:
            for t in self.tickers:
                df = pd.DataFrame()
                for i, v in enumerate(self.portfolio.values()):
                    d = v[s][t]
                    if d is None:
                        if i == 0:
                            _array = X.copy()
                        else:
                            _array = np.vstack([_array, X])
                    else:
                        vals = np.array([float(d[c]) for c in cols])
                        if i == 0:
                            _array = vals.copy()
                        else:
                            _array = np.vstack([_array, vals])
                df = pd.DataFrame(_array, columns=cols, index=index)
                sys_dict[(s, t)] = df.copy()
        return sys_dict

    def get_transactions(self):
        ddict = self.get_system_data_dict()
        transactions = pd.DataFrame()
        for k, v in ddict.items():
            df = pd.concat([v, self.data[k[1]].copy()], axis=1)
            df.fillna(0, inplace=True)
            rets = df['Close'] / df['entry_price'].shift(1) - 1
            trans = pd.DataFrame(rets[df['shares'].diff()<0],
                                 columns=['Returns'])
            trans['System'] = k[0]
            trans['Ticker'] = k[1]
            trans['Long'] = df['long'].shift(1).loc[df['shares'].diff()<0]
            trans['Units'] = df['units'].shift(1).loc[df['shares'].diff()<0]
            trans['Entry_Price'] = df['entry_price'].shift(1).loc[
                df['shares'].diff()<0]
            trans['Sell_Price'] = df['Close'].loc[df['shares'].diff()<0]
            trans['Shares'] = df['shares'].shift(1).loc[df['shares'].diff()<0]
            trans.reset_index(inplace=True)
            trans.rename(columns={'index': 'Date'}, inplace=True)
            transactions = pd.concat([transactions, trans.copy()])

        transactions.reset_index(inplace=True)
        transactions.drop('index', axis=1, inplace=True)
        return transactions


We’re using YFinance to get our data, which is easy because you can pass a list of tickers and it will pull all of the relevant data and concat them into a dataframe for easy manipulation. We do need to do some reorganization of the multi-level index and calculate the breakouts for each system, long and short, as well as N for every ticker. This is all handled in the _prep_data() method which calls a series of methods to get the data, organize it, and calculate our values.

Another important method is _get_units(), which returns the dollar value of one unit we have available to trade. Note that we have an extra term in the numerator which adjusts the available capital based on our system allocation. So if we have a 50:50 split between S1 and S2, then our maximum unit is actually 1% of our capital with the default settings. We use this value to check how many units we have devoted to certain positions, to get our pyramiding, and so forth.
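To make the allocation adjustment concrete, here is a quick check using the constructor defaults (r_max = 0.02 and a 50:50 system split; the variable names are ours):

```python
# Default settings assumed: r_max = 0.02, sys1_allocation = 0.5
portfolio_value = 100_000
r_max = 0.02
sys1_allocation = 0.5

# Dollar value of one unit for System 1: 1% of the portfolio
dollar_units = r_max * portfolio_value * sys1_allocation
```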

The longest method is _run_system() which will run both S1 and S2. This handles the step-by-step details of the Turtle model as laid out above.

run() loops through all of our data to backtest the strategy. There are a few logging functions that take place here as well such as a daily caching of our portfolio for later analysis.

# Sample 10 tickers from S&P 500
url = 'https://en.wikipedia.org/wiki/List_of_S%26P_500_companies'
table = pd.read_html(url)
df = table[0]
syms = df['Symbol']
# Sample symbols
tickers = list(np.random.choice(syms.values, size=10))
print("Ticker Symbols:")
_ = [print(f"\t{i}") for i in tickers]
sys = TurtleSystem(tickers, init_account_size=1E4, start='2000-01-01')
sys.run()

Ticker Symbols:
UAA
HST
UA
CBOE
ATVI
MMM
CTXS
GPC
UHS
PBCT
[*********************100%***********************]  10 of 10 completed


We can plot the returns below against the S&P 500 as a benchmark.

port_values = sys.get_portfolio_values()
returns = port_values / port_values.shift(1)
log_returns = np.log(returns)
cum_rets = log_returns.cumsum()

# Compare to SPY baseline
sp500 = yf.Ticker('SPY').history(start=sys.start, end=sys.end)
sp500['returns'] = sp500['Close'] / sp500['Close'].shift(1)
sp500['log_returns'] = np.log(sp500['returns'])
sp500['cum_rets'] = sp500['log_returns'].cumsum()

plt.figure(figsize=(12, 8))
plt.plot((np.exp(cum_rets) -1 )* 100, label='Turtle Strategy')
plt.plot((np.exp(sp500['cum_rets']) - 1) * 100, label='SPY')
plt.xlabel('Date')
plt.ylabel('Returns (%)')
plt.title('Cumulative Portfolio Returns')
plt.legend()
plt.tight_layout()
plt.show()

stats = getStratStats(log_returns)
spy_stats = getStratStats(sp500['log_returns'])
df_stats = pd.DataFrame(stats, index=['Turtle'])
df_stats = pd.concat([df_stats, pd.DataFrame(spy_stats, index=['SPY'])])
df_stats


The Turtle model outperforms the buy-and-hold approach on the S&P in every category. Surprisingly, the volatility of the Turtle method is lower than that of SPY – surprising because the typical knock on trend following is that the returns tend to be "lumpy," giving back a lot of profit before exiting and hitting large runs again. You can see that pattern at play here as well. Most important is the healthy Sortino ratio, which shows good risk-adjusted returns.

Before you get too excited about the results here, note that this backtest doesn’t account for dividends (which would boost both models) and is rife with survivorship bias. We took 10 random stocks from today’s S&P 500, which is composed of today’s 500 largest companies. By sampling from this group of successful companies, we’re likely to do well. You’d need a larger sample size – including de-listed stocks – to develop confidence in this approach.

Regardless, it does look promising and we can dig a little deeper to understand it further.

Let’s look at the transactions.

## Transaction Stats

transactions = sys.get_transactions()
mean = transactions['Returns'].mean() * 100
median = np.median(transactions['Returns']) * 100

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
plt.figure(figsize=(12, 8))
plt.hist(transactions['Returns']*100, bins=50)
plt.axvline(mean, c=colors[1], label=f'Mean={mean:.1f}%')
plt.axvline(median, c=colors[2], label=f'Median={median:.1f}%')
plt.ylabel('Frequency')
plt.xlabel('Returns (%)')
plt.legend()
plt.show()


The expectation of this strategy is slightly positive with an average return of 0.6%. The mean is pulled above the median by a heavier right tail. Notice those outlier returns on the bottom right? That’s what trend following is banking on – a few big wins to offset many small losses.

So overall, we have a positive expectation, but we can dig a bit deeper. Recall that we have two systems running simultaneously here, longs and shorts, as well as 10 different stocks.

We can look at the histograms and stats for each below.

category = ['System', 'Long']

fig, ax = plt.subplots(1, len(category), figsize=(15, 10), sharey=True)

for i, c in enumerate(category):
    vals = transactions[c].unique()
    for v in vals:
        data = transactions.loc[transactions[c]==v]
        ax[i].hist(data['Returns'] * 100, label=v, alpha=0.5, bins=100)
    ax[i].set_ylabel('Frequency')
    ax[i].set_xlabel('Returns (%)')
    ax[i].legend()

ax[0].set_title('Returns from System')
ax[1].set_title('Returns from Long/Short')

plt.tight_layout()
plt.show()


Looking at these histograms, we see that both systems had very similar performance. Long positions, however, are heavily skewed to the right, with all of the big wins coming from that side. Most of the small losses also came from going long on a breakout.

Going short had the opposite effect: it was a frequent winner, but provided all of the big losses.

Finally, let’s look at some of the stats from the tickers. Even with 10, we have too many to cleanly plot, so we’ll get some summary stats in a table. Feel free to break this down further by system and ticker or system, ticker, and long/short. Just know that you’re going to have a harder time drawing a lot of conclusions as you disaggregate your data further and further due to the diminishing sample size.

Anyway, this is all easy to do:

from scipy.stats import skew

transactions.groupby(['Ticker'])['Returns'].agg(
    ['count', 'mean', 'median', 'std', 'skew'])


We see positive, right-tailed skew from all of these except UHS, and almost all of the tickers provided a positive average return, except UA.

# Turtle Trend Following Forever?

Some of the original Turtles, such as Jerry Parker, have made long and successful careers out of continually and consistently applying trend following rules to markets for decades.

The challenge with trend following – or any quantitative approach – is remaining disciplined. It’s easy to get caught up during a drawdown or in the midst of a cold streak and begin to make discretionary decisions. In those times, it’s important to remind yourself of the stats and what you expect your model to do in the long run.

At Raposa, we want to give you the tools to build quantitative models you can rely on. We’re making it easy for traders and investors to build and test rigorous trading models with a no-code solution running professional code with high-quality data.

## Should you Trade with the Kelly Criterion?

The Kelly Criterion gives an optimal result for betting based on the probability of winning a bet and how much you receive for winning. If you check out Wikipedia or Investopedia, you’ll see formulas like this:

f^{*} = p - \frac{1-p}{b-1}

which gives you the optimal amount to bet (f^*) given the probability of winning (p) and the payout you’re given for the bet (b). For example, if you have a bet that gives you a 52% chance of winning and you have 1:1 odds (bet 1 to win 1 and your money back, so you get a total payout of 2), then you should bet 4% of your capital on each game (0.52 – 0.48/(2-1) = 0.04).
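To make the arithmetic concrete, here is the binary formula as a tiny function, using the article’s convention where b is the total payout per unit staked (the function name is ours, just for illustration):

```python
def kelly_binary(p, b):
    """Optimal fraction of capital to bet.
    p: probability of winning
    b: total payout per unit staked (stake returned plus winnings)
    """
    return p - (1 - p) / (b - 1)

# 52% win probability at 1:1 odds -> bet 4% of capital
print(round(kelly_binary(0.52, 2), 4))  # 0.04
```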

This is fine for a binary, win-lose outcome. The trouble is, investing in stocks doesn’t follow this simple model. If you make a winning trade, you could get a 10% return, 8%, 6.23%, 214%, or any other value.

So how do we change our binary formula to a continuous model?

## Continuous Kelly

Ed Thorp and Claude Shannon (yes, the Claude Shannon for us nerds out there) used the Kelly Criterion to manage their blackjack bankroll and clean up in Vegas in the 60s. Seeing the applicability of this method, Thorp extended it to Wall Street, using it to manage his investments while running his own hedge fund. Developing a continuous Kelly model isn’t actually very straightforward, but Thorp offers a derivation, which you can find in Section 7 here.

Most people probably don’t care about the derivation, so we’ll just jump to this new model which winds up being deceptively simple:

f^{*} = \frac{\mu - r}{\sigma^2}

Here, \mu is the mean return, r is the risk-free rate, and \sigma is the standard deviation of returns. All of this looks very much like the Sharpe Ratio, but instead of dividing by the standard deviation, we divide by the variance.
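Plugging in some hypothetical numbers shows why an unconstrained Kelly factor tends to call for leverage (these stats are illustrative, not from any backtest in this post):

```python
# Hypothetical annualized stats -- illustrative only
mu = 0.06      # mean return
r = 0.01       # risk-free rate
sigma = 0.15   # standard deviation of returns

f_star = (mu - r) / sigma ** 2
print(round(f_star, 2))  # 2.22 -> full Kelly already implies >2x leverage
```

Even a modest 5% edge over the risk-free rate at ordinary equity volatility yields a Kelly factor above 2.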

Now that we have our new formula, let’s put an example strategy or two together to see how it works in theory.

## Long-Only Portfolio with the Kelly Criterion

We’re going to start simple to put this into practice with a long-only portfolio that will rebalance based on the Kelly Criterion. When trading with Kelly position sizing, there are a few things to keep in mind.

First, our Kelly factor (f^*) can go over 1, which implies the use of leverage. This may or may not be feasible depending on the trader, and leverage may incur additional borrowing costs. Clearly, it increases your risk and volatility as well, so not everyone will want to trade with leverage.

Second, we need to estimate the mean and standard deviation of our security. This can be affected by our look-back period. Do we use one year? One month? Or some other time period? One year seems to be standard, so we’ll start with that. Also, the time period you choose may be related to the speed of your trading strategy. If you’re trading minute-by-minute, perhaps a one-month look-back period makes more sense.

Finally, we need to determine how frequently we’ll rebalance our portfolio according to our updated Kelly factor. This can create a lot of transaction costs if it occurs too frequently. There could also be tax implications associated with selling positions quickly, which are beyond the scope of this model. For our purposes, we’ll rebalance every day so that we’re always holding the estimated, optimal position.

Ok, with that out of the way, let’s get some Python packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf


## Calculating the Kelly Criterion for Stocks

Our first function is going to be straightforward. All we’re doing is plugging the mean, standard deviation, and our interest rate into that formula above and returning f-star.

def calcKelly(mean, std, r):
    return (mean - r) / std**2


Our example is going to be a simple, vectorized backtest to quickly give you the flavor of using the Kelly formula for position sizing. We’ll write one more helper function that will take our returns as a Pandas Series and return the f-star for each time step.

def getKellyFactor(returns: pd.Series, r=0.01,
                   max_leverage=None, periods=252, rolling=True):
    '''
    Calculates the Kelly Factor for each time step based
    on the parameters provided.
    '''
    if rolling:
        std = returns.rolling(periods).std()
        mean = returns.rolling(periods).mean()
    else:
        std = returns.expanding(periods).std()
        mean = returns.expanding(periods).mean()

    r_daily = np.log((1 + r) ** (1 / 252))
    kelly_factor = calcKelly(mean, std, r_daily)

    # No shorts
    kelly_factor = np.where(kelly_factor<0, 0, kelly_factor)

    # Cap the leverage if requested
    if max_leverage is not None:
        kelly_factor = np.where(kelly_factor>max_leverage,
                                max_leverage, kelly_factor)

    return kelly_factor


Let’s use some real data to demonstrate this. I chose SPY, the S&P 500 ETF introduced back in 1993. We can get its history through 2020 from yfinance with the code below:

ticker = 'SPY'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start='1993-01-01', end='2020-12-31')
# Drop unused columns
data.drop(['Open', 'High', 'Low', 'Volume', 'Dividends',
           'Stock Splits'], axis=1, inplace=True)


Now we’re ready to put this into action in a strategy. A key assumption stated by Thorp in the Kelly formula above is the lack of short selling. With that in mind, we’re going to start with the simplest strategy I can think of: a buy-and-hold strategy that rebalances the portfolio between cash and equities according to the Kelly factor.

For this, we’re going to assign a percentage of our total portfolio to the SPY based on the Kelly factor. If leverage increases above 1, then we will have a negative cash value to reflect our borrowing. I didn’t get very granular, so I just used the same interest rate from 1993-2020 and assumed you can borrow and lend at that same rate – which is really a ridiculous assumption because nobody is going to let you, dear retail investor, pay 1% to lever up your stock portfolio (commodities, FOREX, crypto, and other markets typically have more leverage available). You can get daily 10-year rates or whatever treasury instrument suits you as a baseline from FRED, the US Treasury site, or your favorite data provider to use real data in your Kelly calculation.

Let’s get to the function.

def LongOnlyKellyStrategy(data, r=0.02, max_leverage=None, periods=252,
                          rolling=True):
    data['returns'] = data['Close'] / data['Close'].shift(1)
    data['log_returns'] = np.log(data['returns'])
    data['kelly_factor'] = getKellyFactor(data['log_returns'],
        r, max_leverage, periods, rolling)

    cash = np.zeros(data.shape[0])
    equity = np.zeros(data.shape[0])
    portfolio = cash.copy()
    portfolio[0] = 1
    cash[0] = 1
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if np.isnan(row['kelly_factor']):
            portfolio[i] += portfolio[i-1]
            cash[i] += cash[i-1]
            continue

        portfolio[i] += cash[i-1] * (1 + r)**(1/252) + equity[i-1] * row['returns']
        equity[i] += portfolio[i] * row['kelly_factor']
        cash[i] += portfolio[i] * (1 - row['kelly_factor'])

    data['cash'] = cash
    data['equity'] = equity
    data['portfolio'] = portfolio
    data['strat_returns'] = data['portfolio'] / data['portfolio'].shift(1)
    data['strat_log_returns'] = np.log(data['strat_returns'])
    data['strat_cum_returns'] = data['strat_log_returns'].cumsum()
    data['cum_returns'] = data['log_returns'].cumsum()
    return data


First, we’re going to run the strategy with no cap on leverage and compare it to a baseline buy-and-hold.

kelly = LongOnlyKellyStrategy(data.copy())

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
ax[0].plot(np.exp(kelly['cum_returns']) * 100, label='Buy and Hold')
ax[0].plot(np.exp(kelly['strat_cum_returns']) * 100, label='Kelly Model')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Buy-and-hold and Long-Only Strategy with Kelly Sizing')
ax[0].legend()

ax[1].plot(kelly['kelly_factor'])
ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()


Our model blew up spectacularly! Without any constraints on our leverage, it quickly shot up to 50x leverage – and did great for a bit – but then hit some major losses and plummeted below the baseline. It wound up losing our initial investment fairly quickly.

Important lesson here: while the Kelly Criterion may give you the optimal allocation to trade with, it is only as good as the assumptions that underpin it. Note that this is *not* a predictive tool – we’re looking back in time and assuming the previous year’s volatility is a good guide going forward.

We can do a few different things to de-risk this while gaining the benefits of a better money management strategy. First, we can cap our leverage to something more reasonable, say 3 or 4x (although I’d even keep it lower myself). Let’s see how this impacts our model.
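The cap itself is just elementwise clipping – `np.clip` is equivalent to the `np.where` pair already in `getKellyFactor` (the values below are made up for illustration):

```python
import numpy as np

# Hypothetical daily Kelly factors -- illustrative values only
kf = np.array([0.5, 2.7, 6.0, -1.2])

# Floor at 0 (no shorts) and cap at 3x leverage, in one call
capped = np.clip(kf, 0, 3)
print(capped.tolist())  # [0.5, 2.7, 3.0, 0.0]
```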

def getStratStats(log_returns: pd.Series,
                  risk_free_rate: float = 0.02):
    stats = {}

    # Total Returns
    stats['tot_returns'] = np.exp(log_returns.sum()) - 1

    # Mean Annual Returns
    stats['annual_returns'] = np.exp(log_returns.mean() * 252) - 1

    # Annual Volatility
    stats['annual_volatility'] = log_returns.std() * np.sqrt(252)

    # Sortino Ratio
    annualized_downside = log_returns.loc[log_returns<0].std() * \
        np.sqrt(252)
    stats['sortino_ratio'] = (stats['annual_returns'] - \
        risk_free_rate) / annualized_downside

    # Sharpe Ratio
    stats['sharpe_ratio'] = (stats['annual_returns'] - \
        risk_free_rate) / stats['annual_volatility']

    # Max Drawdown
    cum_returns = log_returns.cumsum() - 1
    peak = cum_returns.cummax()
    drawdown = peak - cum_returns
    max_idx = drawdown.argmax()
    stats['max_drawdown'] = 1 - np.exp(cum_returns[max_idx]) / np.exp(peak[max_idx])

    # Max Drawdown Duration
    strat_dd = drawdown[drawdown==0]
    strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
    strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
    strat_dd_days = np.hstack([strat_dd_days,
        (drawdown.index[-1] - strat_dd.index[-1]).days])
    stats['max_drawdown_duration'] = strat_dd_days.max()
    return stats

max_leverage = np.arange(1, 6)

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)
data_dict = {}
df_stats = pd.DataFrame()
for l in max_leverage:
    kelly = LongOnlyKellyStrategy(data.copy(), max_leverage=l)
    data_dict[l] = kelly.copy()
    ax[0].plot(np.exp(kelly['strat_cum_returns']) * 100,
               label=f'Max Leverage = {l}')
    ax[1].plot(kelly['kelly_factor'], label=f'Max Leverage = {l}')
    stats = getStratStats(kelly['strat_log_returns'])
    df_stats = pd.concat([df_stats,
        pd.DataFrame(stats, index=[f'Leverage={l}'])])

ax[0].plot(np.exp(kelly['cum_returns']) * 100, label='Buy and Hold',
           linestyle=':')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Buy-and-hold and Long-Only Strategy with Kelly Sizing')
ax[0].legend()

ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()

stats = pd.DataFrame(getStratStats(kelly['log_returns']), index=['Buy and Hold'])
df_stats = pd.concat([stats, df_stats])
df_stats


There’s a lot to unpack here depending on the leverage ratio. For clarity, a leverage ratio of 1 means we aren’t actually using any leverage; at most, we put our entire portfolio into the SPY. This more conservative model winds up with lower total returns than the buy-and-hold approach, but it has a shorter drawdown with less volatility.

Allowing leverage to max out at 2x increases the total returns over buy-and-hold, but with more volatility and lower risk-adjusted returns. It’s a bit tough to see these models in the plot above, so we’ll zoom in on these results below.

fig, ax = plt.subplots(2, figsize=(15, 8), sharex=True)

ax[0].plot(np.exp(data_dict[1]['strat_cum_returns'])*100,
           label='Max Leverage=1')
ax[0].plot(np.exp(data_dict[2]['strat_cum_returns'])*100,
           label='Max Leverage=2')
ax[0].plot(np.exp(data_dict[1]['cum_returns'])*100,
           label='Buy and Hold', linestyle=':')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Low-Leverage and Baseline Models')
ax[0].legend()

ax[1].plot(data_dict[1]['kelly_factor'], label='Max Leverage = 1')
ax[1].plot(data_dict[2]['kelly_factor'], label='Max Leverage = 2')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('Leverage')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()


Here, we can see that the Kelly Criterion tends to get out of the market and go to cash as the volatility increases during large drawdowns. The worst of the crashes in 2000 and 2008 are avoided. The COVID crash in 2020, however, was much more rapid and wound up leading to the big losses – particularly the more leveraged the strategy was – as the strategy stayed in the market during the worst of it and got crushed.

Unsurprisingly, the more highly leveraged models all had more volatility with much larger drawdowns. They gave us higher total returns, but the risk metrics all show they did so with a lot more risk.

Above, I mentioned a few ways we can cut down on our leverage beyond just a hard cap.

## Larger Sample Size

We can increase the sample size by increasing the number of periods we use for calculating the mean and standard deviation of our returns. This will lead to a slower reacting model, but may lead to better estimates and thus better results.

max_leverage = 3

periods = 252 * np.arange(1, 5)

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)
data_dict = {}
df_stats = pd.DataFrame()
for p in periods:
    p = int(p)
    kelly = LongOnlyKellyStrategy(data.copy(), periods=p,
                                  max_leverage=max_leverage)
    data_dict[p] = kelly.copy()
    ax[0].plot(np.exp(kelly['strat_cum_returns']) * 100,
               label=f'Days = {p}')
    ax[1].plot(kelly['kelly_factor'], label=f'Days = {p}', linewidth=0.5)
    stats = getStratStats(kelly['strat_log_returns'])
    df_stats = pd.concat([df_stats,
        pd.DataFrame(stats, index=[f'Days={p}'])])

ax[0].plot(np.exp(kelly['cum_returns']) * 100, label='Buy and Hold',
           linestyle=':')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(
    'Buy-and-hold and Long-Only Strategy with Kelly Sizing ' +
    'and Variable Lookback Periods')
ax[0].legend()

ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()

stats = pd.DataFrame(getStratStats(kelly['log_returns']), index=['Buy and Hold'])
df_stats = pd.concat([stats, df_stats])
df_stats


These longer lookback periods tend to increase the time out of market, which extends the drawdown durations in many cases. But they wind up with higher overall returns, at least up to the 4-year lookback period. One thing that makes these comparisons somewhat unfair is that the longer models require more time and data before they enter the market in the first place. So the SPY gets to start compounding immediately, the 1-year model gets going after 252 trading days have passed, and so forth. Regardless, we still see good gains from these longer-term systems.

One more example of this before moving on. I included an argument called rolling in our function that defaults to True. It’s almost always better to include more data in our estimates, so if we set this to False, we switch out a rolling horizon for an expanding horizon calculation. This latter approach will be identical to the rolling horizon model at day 252 under the standard settings, but then will diverge because it doesn’t drop old data. So the mean and standard deviation of the SPY will contain the last 500 data points at day 500. This quickly increases our sample size as the model moves forward in time. In addition, the number of periods we use now serves as a minimum number of data points required before we get a result for our Kelly factor.
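The difference between the two horizon types is easiest to see on a toy series: `rolling` keeps a fixed window, while `expanding` keeps everything once the minimum number of periods is reached.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Fixed 3-period window vs. a window that grows once 3 periods exist
print(s.rolling(3).mean().tolist())    # [nan, nan, 2.0, 3.0, 4.0]
print(s.expanding(3).mean().tolist())  # [nan, nan, 2.0, 2.5, 3.0]
```

The two agree at the first valid observation and diverge afterward, exactly as described above for day 252 and beyond.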

kelly = LongOnlyKellyStrategy(data.copy(), rolling=False)

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
ax[0].plot(np.exp(kelly['cum_returns']) * 100, label='Buy and Hold')
ax[0].plot(np.exp(kelly['strat_cum_returns']) * 100, label='Kelly Model')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(
    'Buy-and-hold and Long-Only Strategy with Kelly Sizing ' +
    'and Expanding Horizon')
ax[0].legend()

ax[1].plot(kelly['kelly_factor'])
ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Expanding Kelly Factor')

plt.tight_layout()
plt.show()


In this case, we see that the Kelly factor largely decreases over time as the sample grows. It does begin to increase after 2008; however, the model performed very poorly after the 2000 crash – which it was highly levered heading into – and was never able to recover.

## Kelly Sizing and a Trading Strategy

So far we have just looked at applying the Kelly Criterion to a single asset to manage our cash-equity balance. What if we want to use it with a trading strategy on a single asset? Could we improve our risk-adjusted returns in this scenario?

The answer is “yes,” but we do need to be careful in how we apply it. If we run the risk of blowing up thanks to leverage while trading the S&P 500, the risk can be even greater with a given trading strategy.

We’re going to keep things as simple as possible here and run a moving average cross-over strategy. Again, we’ll keep all the assumptions about liquidity, leverage, and no short selling that we laid out above.

There are a couple of changes we’ll need to make to run this. First, we will need to figure out our position, which is just going to be when the short-term SMA is above the long-term SMA.

Second, instead of using the mean and standard deviation of the underlying asset, we’re going to rely on the stats from our strategy – that is, the returns the strategy would have produced without any position sizing.

The code is given below and is similar to what you saw in our long-only strategy.

# Kelly money management for trading strategy
def KellySMACrossOver(data, SMA1=50, SMA2=200, r=0.01,
                      periods=252, max_leverage=None, rolling=True):
    '''
    Sizes a simple moving average cross-over strategy according
    to the Kelly Criterion.
    '''
    data['returns'] = data['Close'] / data['Close'].shift(1)
    data['log_returns'] = np.log(data['returns'])

    # Calculate positions
    data['SMA1'] = data['Close'].rolling(SMA1).mean()
    data['SMA2'] = data['Close'].rolling(SMA2).mean()
    data['position'] = np.where(data['SMA1']>data['SMA2'], 1, 0)
    data['position'] = data['position'].ffill().fillna(0)

    data['_strat_returns'] = data['position'].shift(1) * \
        data['returns']
    data['_strat_log_returns'] = data['position'].shift(1) * \
        data['log_returns']

    # Calculate Kelly Factor using the strategy's returns
    kf = getKellyFactor(data['_strat_log_returns'], r,
                        max_leverage, periods, rolling)
    data['kelly_factor'] = kf

    cash = np.zeros(data.shape[0])
    equity = np.zeros(data.shape[0])
    portfolio = cash.copy()
    portfolio[0] = 1
    cash[0] = 1
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if np.isnan(kf[i]):
            portfolio[i] += portfolio[i-1]
            cash[i] += cash[i-1]
            continue

        portfolio[i] += cash[i-1] * (1 + r)**(1/252) + equity[i-1] * row['returns']
        equity[i] += portfolio[i] * row['kelly_factor']
        cash[i] += portfolio[i] * (1 - row['kelly_factor'])

    data['cash'] = cash
    data['equity'] = equity
    data['portfolio'] = portfolio
    data['strat_returns'] = data['portfolio'] / data['portfolio'].shift(1)
    data['strat_log_returns'] = np.log(data['strat_returns'])
    data['strat_cum_returns'] = data['strat_log_returns'].cumsum()
    data['cum_returns'] = data['log_returns'].cumsum()
    return data


Let’s see how the model performs vs the baseline with moderate leverage.

kelly_sma = KellySMACrossOver(data.copy(), max_leverage=3)

fig, ax = plt.subplots(2, figsize=(15, 8), sharex=True)
ax[0].plot(np.exp(kelly_sma['strat_cum_returns'])* 100, label='SMA-Kelly')
ax[0].plot(np.exp(kelly_sma['_strat_log_returns'].cumsum()) * 100, label='SMA')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Moving Average Cross-Over Strategy with Kelly Sizing')
ax[0].legend()

ax[1].plot(kelly_sma['kelly_factor'])
ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()

sma_stats = pd.DataFrame(getStratStats(kelly_sma['log_returns']),
                         index=['Buy and Hold'])
sma_stats = pd.concat([sma_stats,
    pd.DataFrame(getStratStats(kelly_sma['strat_log_returns']),
                 index=['Kelly SMA Model'])])
sma_stats = pd.concat([sma_stats,
    pd.DataFrame(getStratStats(kelly_sma['_strat_log_returns']),
                 index=['SMA Model'])])
sma_stats


The Kelly SMA model doubles the buy-and-hold approach in terms of total returns. Like the others, it was in a leveraged long position heading into the COVID crash and got crushed. Moreover, on a risk-adjusted basis, it performs worse than either a basic SMA model or the buy-and-hold strategy.

## Trading with the Kelly Criterion

Leverage is a powerful and dangerous tool: it amplifies both gains and losses. Each of the strategies we ran without a cap on the leverage ratio blew up at some point, highlighting the dangers of such an approach. Most traders who do use the Kelly Criterion in their position sizing only trade half or quarter Kelly, i.e. with 50% or 25% of the Kelly factor size. This controls risk and helps avoid blowing up, which is a fate much worse than underperforming the market for a few years.
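Fractional Kelly is just a scalar on the factor we computed earlier; a sketch with illustrative numbers (the function name and stats are ours, not from the backtest above):

```python
def fractional_kelly(mean, std, r, fraction=0.5):
    """Half-Kelly by default: scale the full-Kelly factor down for safety."""
    return fraction * (mean - r) / std ** 2

# Hypothetical annualized stats: 6% mean, 15% vol, 1% risk-free
full = fractional_kelly(0.06, 0.15, 0.01, fraction=1.0)
half = fractional_kelly(0.06, 0.15, 0.01)
print(round(full, 2), round(half, 2))  # 2.22 1.11
```

Halving the fraction halves the leverage, which cuts the damage when the backward-looking estimates turn out to be wrong.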

It’s important to note that the Kelly Criterion is not predictive. While it may optimize the long-run growth of your returns, we see that it falls down time and time again when unconstrained because it is looking backwards over a small sample size. To get the most out of it, we’d need to use it in a constrained setting on a diversified strategy over many markets. That way, we’d be developing our stats based on the performance of the strategy, giving us a larger sample size and better estimate while also limiting our leverage. We’ll look at strategies like this in future posts.

For now, if you want to keep up with what we’re doing at Raposa, sign up with your email below!

## Test and Trade RSI Divergence in Python

Divergences occur when price and your indicator move in opposite directions. For example, you’re trading with the RSI and it last had a peak at 80; now it peaks at 70. The underlying security you’re trading was at $14 when the RSI hit 80, and now hits a new peak at $18. This is a divergence.

Traders will refer to the price as reaching a “higher high” and the RSI as a “lower high” because of the trend of the peaks. Technical traders track these visually – but it can be difficult to replicate because it isn’t always clear what exactly makes a “peak.” We provide an algorithmic way to detect peaks and troughs for trading, which we’ll leverage below as we get into the details of building an RSI divergence strategy.

## Divergences as Entry Signals

Divergences are often referred to as being either “bearish” or “bullish.” A bearish divergence is like the one we saw in the example above. We have a momentum indicator that is weakening ahead of the price which gives us a point to jump in and short it. A bullish divergence is where we have higher lows occurring for our momentum indicator, but lower lows in the price.
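Given the last two confirmed highs (or lows) in the price and the indicator, the divergence checks boil down to a pair of comparisons. These helper names are our own, just to pin down the definitions:

```python
def is_bearish_divergence(price_highs, ind_highs):
    # Price prints a higher high while the indicator prints a lower high
    return price_highs[-1] > price_highs[-2] and ind_highs[-1] < ind_highs[-2]

def is_bullish_divergence(price_lows, ind_lows):
    # Price prints a lower low while the indicator prints a higher low
    return price_lows[-1] < price_lows[-2] and ind_lows[-1] > ind_lows[-2]

# The example from above: price $14 -> $18 while RSI 80 -> 70
print(is_bearish_divergence([14, 18], [80, 70]))  # True
```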

Under this interpretation, divergences are meant to be leading indicators – the divergence occurs before the price action confirms it. In practice, this is a bit more challenging to pull off because you find yourself looking for peaks in price and indicator, and a peak isn’t known to be a peak until some more time passes so you can see if the value decreases.

At any rate, let’s get to some code to illustrate how this works!

## Detecting Divergences

The first step is going to require importing a few packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from scipy.signal import argrelextrema
from collections import deque


If you follow this blog, you’ll notice a few extra packages here. argrelextrema comes from SciPy’s signal processing library and detects local extrema, and deque is like a fixed-length list that drops the oldest entry when a new one pushes it past its maximum length. We’ll use the first to spot the extrema in our data, then cycle through them and keep the points that are higher than the previous entries.

To spot our extrema, we need to pass an argument called order. This defines how many points on either side of a peak must be lower for us to actually label it a peak. So with order=5, a point needs to be the highest within 5 data points to the right and left. The other argument we provide is K, an integer that determines how many consecutive peaks we need to identify a trend of higher highs.
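Before the full function, a quick look at how the two pieces behave on a toy array:

```python
import numpy as np
from scipy.signal import argrelextrema
from collections import deque

x = np.array([1, 3, 2, 5, 4, 6, 1])

# Indices that beat their single neighbor on each side (order=1)
peaks = argrelextrema(x, np.greater, order=1)[0]
print(peaks.tolist())  # [1, 3, 5]

# deque(maxlen=K) keeps only the K most recent peak indices
d = deque(maxlen=2)
for idx in peaks:
    d.append(idx)
print(list(d))  # [3, 5]
```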

The full, higher-high detection function is given below.

def getHigherHighs(data: np.array, order=5, K=2):
    '''
    Finds consecutive higher highs in price pattern.
    Must not be exceeded within the number of periods indicated by the
    order parameter for the value to be confirmed.
    K determines how many consecutive highs need to be higher.
    '''
    # Get highs
    high_idx = argrelextrema(data, np.greater, order=order)[0]
    highs = data[high_idx]

    # Ensure consecutive highs are higher than previous highs
    extrema = []
    ex_deque = deque(maxlen=K)
    for i, idx in enumerate(high_idx):
        if i == 0:
            ex_deque.append(idx)
            continue
        if highs[i] < highs[i-1]:
            ex_deque.clear()

        ex_deque.append(idx)
        if len(ex_deque) == K:
            extrema.append(ex_deque.copy())

    return extrema


This returns a list of deques containing indices for our peaks. To get all of the relevant combinations for identifying divergences, we need four such functions: one for higher highs (above), and one each for lower lows, lower highs, and higher lows. The logic is identical for each – we just swap np.greater for np.less in the argrelextrema call and flip the inequality sign when comparing consecutive extrema. To keep things short, I’m not going to provide all the code in this post, but you can find each of these functions and additional explanations here.

We need some data, so we’ll pull that from the Yahoo! Finance API using the yfinance package. I’m going to use ExxonMobil (XOM) because it has seen a fair share of booms and busts over the past few decades, but most importantly, I just got gas there.

start = '2011-01-01'
end = '2011-07-31'

ticker = 'XOM'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start=start, end=end)
# Drop unused columns
data.drop(['Open', 'High', 'Low', 'Volume', 'Dividends',
           'Stock Splits'], axis=1, inplace=True)



Now we can calculate all of our extrema and plot the results.

from matplotlib.lines import Line2D # For legend

price = data['Close'].values
dates = data.index

# Get higher highs, lower lows, etc.
order = 5
hh = getHigherHighs(price, order)
lh = getLowerHighs(price, order)
ll = getLowerLows(price, order)
hl = getHigherLows(price, order)

# Get confirmation indices
hh_idx = np.array([i[1] + order for i in hh])
lh_idx = np.array([i[1] + order for i in lh])
ll_idx = np.array([i[1] + order for i in ll])
hl_idx = np.array([i[1] + order for i in hl])

# Plot results
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

plt.figure(figsize=(12, 8))
plt.plot(data['Close'])
plt.scatter(dates[hh_idx], price[hh_idx-order], marker='^', c=colors[1])
plt.scatter(dates[lh_idx], price[lh_idx-order], marker='v', c=colors[2])
plt.scatter(dates[ll_idx], price[ll_idx-order], marker='v', c=colors[3])
plt.scatter(dates[hl_idx], price[hl_idx-order], marker='^', c=colors[4])
_ = [plt.plot(dates[i], price[i], c=colors[1]) for i in hh]
_ = [plt.plot(dates[i], price[i], c=colors[2]) for i in lh]
_ = [plt.plot(dates[i], price[i], c=colors[3]) for i in ll]
_ = [plt.plot(dates[i], price[i], c=colors[4]) for i in hl]

plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'Potential Divergence Points for {ticker} Closing Price')

legend_elements = [
    Line2D([0], [0], color=colors[0], label='Close'),
    Line2D([0], [0], color=colors[1], label='Higher Highs'),
    Line2D([0], [0], color='w', marker='^', markersize=10,
           markerfacecolor=colors[1], label='Higher High Confirmation'),
    Line2D([0], [0], color=colors[2], label='Lower Highs'),
    Line2D([0], [0], color='w', marker='v', markersize=10,
           markerfacecolor=colors[2], label='Lower Highs Confirmation'),
    Line2D([0], [0], color=colors[3], label='Lower Lows'),
    Line2D([0], [0], color='w', marker='v', markersize=10,
           markerfacecolor=colors[3], label='Lower Lows Confirmation'),
    Line2D([0], [0], color=colors[4], label='Higher Lows'),
    Line2D([0], [0], color='w', marker='^', markersize=10,
           markerfacecolor=colors[4], label='Higher Lows Confirmation')
]
plt.legend(handles=legend_elements, bbox_to_anchor=(1, 0.65))
plt.show()


In this plot, we pull out all of our potential divergence points and map the highs and lows to the price. Also, notice that I plotted confirmation points for each of the peaks. This goes back to that order value: we have no idea whether a peak is actually a peak until we give it a few days (5, in this case) to see what the price does next.

The price chart is just half of what we need for divergences; we also need to apply an indicator. Drawing on Kaufman’s excellent Trading Systems and Methods, we ought to use some kind of momentum indicator. We’ll go ahead and apply the RSI, although the MACD, stochastics, and so forth would be applicable as well.

## Peaks and Valleys of the RSI

The RSI is most frequently interpreted as showing upward momentum when the value is above the center line (RSI = 50), and downward momentum when it is below the center line. If we have a series of smaller peaks above 50, it could indicate a reduction in momentum, and a series of increasing valleys below 50 could be a sign of increasing momentum that we could trade.
The next step is to calculate the RSI, then apply the same technique as shown above to extract the relevant extrema. Code for the RSI is taken from this post if you’d like to get into the details and see some examples.

def calcRSI(data, P=14):
    data['diff_close'] = data['Close'] - data['Close'].shift(1)
    data['gain'] = np.where(data['diff_close']>0, data['diff_close'], 0)
    data['loss'] = np.where(data['diff_close']<0,
        np.abs(data['diff_close']), 0)
    data[['init_avg_gain', 'init_avg_loss']] = data[
        ['gain', 'loss']].rolling(P).mean()
    avg_gain = np.zeros(len(data))
    avg_loss = np.zeros(len(data))
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if i < P - 1:
            continue
        elif i == P - 1:
            avg_gain[i] += row['init_avg_gain']
            avg_loss[i] += row['init_avg_loss']
        else:
            avg_gain[i] += ((P - 1) * avg_gain[i-1] + row['gain']) / P
            avg_loss[i] += ((P - 1) * avg_loss[i-1] + row['loss']) / P
    data['avg_gain'] = avg_gain
    data['avg_loss'] = avg_loss
    data['RS'] = data['avg_gain'] / data['avg_loss']
    data['RSI'] = 100 - 100 / (1 + data['RS'])
    return data


With that function in place, we can add the RSI and its related columns to our data frame, then extract its extrema:

data = calcRSI(data.copy())
rsi = data['RSI'].values

# Get values to mark RSI highs/lows and plot
rsi_hh = getHigherHighs(rsi, order)
rsi_lh = getLowerHighs(rsi, order)
rsi_ll = getLowerLows(rsi, order)
rsi_hl = getHigherLows(rsi, order)


We’ll follow the same format as above to plot our results:

fig, ax = plt.subplots(2, figsize=(20, 12), sharex=True)
ax[0].plot(data['Close'])
ax[0].scatter(dates[hh_idx], price[hh_idx-order], marker='^', c=colors[1])
ax[0].scatter(dates[lh_idx], price[lh_idx-order], marker='v', c=colors[2])
ax[0].scatter(dates[hl_idx], price[hl_idx-order], marker='^', c=colors[3])
ax[0].scatter(dates[ll_idx], price[ll_idx-order], marker='v', c=colors[4])
_ = [ax[0].plot(dates[i], price[i], c=colors[1]) for i in hh]
_ = [ax[0].plot(dates[i], price[i], c=colors[2]) for i in lh]
_ = [ax[0].plot(dates[i], price[i], c=colors[3]) for i in hl]
_ = [ax[0].plot(dates[i], price[i], c=colors[4]) for i in ll]
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and Potential Divergence Points for {ticker}')
ax[0].legend(handles=legend_elements)

ax[1].plot(data['RSI'])
ax[1].scatter(dates[rsi_hh_idx], rsi[rsi_hh_idx-order],
marker='^', c=colors[1])
ax[1].scatter(dates[rsi_lh_idx], rsi[rsi_lh_idx-order],
marker='v', c=colors[2])
ax[1].scatter(dates[rsi_hl_idx], rsi[rsi_hl_idx-order],
marker='^', c=colors[3])
ax[1].scatter(dates[rsi_ll_idx], rsi[rsi_ll_idx-order],
marker='v', c=colors[4])
_ = [ax[1].plot(dates[i], rsi[i], c=colors[1]) for i in rsi_hh]
_ = [ax[1].plot(dates[i], rsi[i], c=colors[2]) for i in rsi_lh]
_ = [ax[1].plot(dates[i], rsi[i], c=colors[3]) for i in rsi_hl]
_ = [ax[1].plot(dates[i], rsi[i], c=colors[4]) for i in rsi_ll]

ax[1].set_ylabel('RSI')
ax[1].set_title(f'RSI and Potential Divergence Points for {ticker}')
ax[1].set_xlabel('Date')

plt.tight_layout()
plt.show()


This is only a short, 7-month window, chosen so we can clearly see the moves of both the price and the RSI, and only one divergence is visible. We see confirmation of higher lows in mid-June on the RSI chart (orange, upward-pointing triangle) in the midst of a long series of lower lows in the price chart (blue, downward-pointing triangles). We don’t trade off of charts, though, so let’s put an algorithm together to test this RSI divergence model.

## Building an RSI Divergence Model

So far we have some general rules for identifying divergences, but we still need entry and exit rules. For starters, we can turn to Kaufman’s excellent Trading Systems and Methods, where he lays out an example strategy with the following rules:

1. Enter a position when the divergence has been identified if the indicator is above the target level (e.g. RSI = 50).
2. Exit if the indicator divergence disappears. For example, if we short while the price makes a higher high and the RSI makes a lower high, and the RSI then moves to a higher high, we’re out.
3. Exit once the indicator has reached the target level.
4. Allow the divergence to convert to a trend position. For this, we use a separate trend indicator (e.g. EMA cross-over) and we hold the position if the trend is in the same direction as the divergence. If the divergence disappears but the trend continues we hold, and exit only when the trend disappears.

We’re going to build two models off of Kaufman’s rules: one that only trades the divergences (rules 1-3) and one that adds the trend (all 4 rules). Of course, feel free to modify these as you see fit and experiment with a variety of approaches yourself.

Next, I’m going to build some helper functions to mark our peaks. The first set modifies the output of our getHigherHighs group of functions. Those were built for the visualizations above, but we just need to extract the confirmation points of the trends for our model. Also note that because we’re adding order to our index, we could get confirmation points that would raise index errors, so we drop any indices that are greater than the number of data points we have.

The four functions are below:

def getHHIndex(data: np.array, order=5, K=2):
    extrema = getHigherHighs(data, order, K)
    idx = np.array([i[-1] + order for i in extrema])
    return idx[np.where(idx < len(data))]

def getLHIndex(data: np.array, order=5, K=2):
    extrema = getLowerHighs(data, order, K)
    idx = np.array([i[-1] + order for i in extrema])
    return idx[np.where(idx < len(data))]

def getLLIndex(data: np.array, order=5, K=2):
    extrema = getLowerLows(data, order, K)
    idx = np.array([i[-1] + order for i in extrema])
    return idx[np.where(idx < len(data))]

def getHLIndex(data: np.array, order=5, K=2):
    extrema = getHigherLows(data, order, K)
    idx = np.array([i[-1] + order for i in extrema])
    return idx[np.where(idx < len(data))]


To reduce re-writing code, I’m going to introduce a function called getPeaks, which takes our data frame and encodes the output of our highs and lows into column vectors. It uses the four functions defined above and assigns a value of 1 to the Close_highs column from the point a higher high is confirmed. If our highs are trending down after confirming a lower high, it marks that with a -1 in the same column, and it does the same for the lows. It will be important to remember which values have a 1 and which have a -1, so I made it a 1 if the trend is increasing (higher highs or higher lows) and -1 if the trend is decreasing (lower highs or lower lows).
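The forward-fill encoding is easy to see in isolation (toy bar indices, not real confirmation points):

```python
import numpy as np
import pandas as pd

# Mark confirmation bars with +/-1, then carry the last trend value forward
highs = pd.Series(np.nan, index=range(8))
highs[2] = 1    # higher high confirmed at bar 2 -> trend increasing
highs[5] = -1   # lower high confirmed at bar 5 -> trend decreasing
highs = highs.ffill().fillna(0)
print(highs.tolist())  # [0.0, 0.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
```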

def getPeaks(data, key='Close', order=5, K=2):
    vals = data[key].values
    hh_idx = getHHIndex(vals, order, K)
    lh_idx = getLHIndex(vals, order, K)
    ll_idx = getLLIndex(vals, order, K)
    hl_idx = getHLIndex(vals, order, K)

    data[f'{key}_highs'] = np.nan
    data[f'{key}_highs'][hh_idx] = 1
    data[f'{key}_highs'][lh_idx] = -1
    data[f'{key}_highs'] = data[f'{key}_highs'].ffill().fillna(0)
    data[f'{key}_lows'] = np.nan
    # Per our encoding: higher lows (increasing trend) get 1,
    # lower lows (decreasing trend) get -1
    data[f'{key}_lows'][hl_idx] = 1
    data[f'{key}_lows'][ll_idx] = -1
    # Fill the lows column (not the highs column)
    data[f'{key}_lows'] = data[f'{key}_lows'].ffill().fillna(0)
    return data


Finally, we can build our strategy. Here, we’re just following the first 3 rules laid out above.

def RSIDivergenceStrategy(data, P=14, order=5, K=2):
    '''
    Go long/short on price and RSI divergence.
    - Long if price to lower low and RSI to higher low with RSI < 50
    - Short if price to higher high and RSI to lower high with RSI > 50
    Sell if the divergence disappears.
    Sell if the RSI crosses the centerline.
    '''
    data = getPeaks(data, key='Close', order=order, K=K)
    data = calcRSI(data, P=P)
    data = getPeaks(data, key='RSI', order=order, K=K)

    position = np.zeros(data.shape[0])
    for i, (t, row) in enumerate(data.iterrows()):
        if np.isnan(row['RSI']):
            continue
        # If no position is on
        if position[i-1] == 0:
            # Buy if price to lower low and indicator to higher low
            if row['Close_lows'] == -1 and row['RSI_lows'] == 1:
                if row['RSI'] < 50:
                    position[i] = 1
                    entry_rsi = row['RSI'].copy()

            # Short if price to higher high and indicator to lower high
            elif row['Close_highs'] == 1 and row['RSI_highs'] == -1:
                if row['RSI'] > 50:
                    position[i] = -1
                    entry_rsi = row['RSI'].copy()

        # If current position is long, hold while the RSI stays below
        # the centerline and above the entry RSI (divergence intact)
        elif position[i-1] == 1:
            if row['RSI'] < 50 and row['RSI'] > entry_rsi:
                position[i] = 1

        # If current position is short, hold while the RSI stays above
        # the centerline and below the entry RSI (divergence intact)
        elif position[i-1] == -1:
            if row['RSI'] > 50 and row['RSI'] < entry_rsi:
                position[i] = -1

    data['position'] = position
    return calcReturns(data)

def calcReturns(df):
    # Helper function to avoid repeating too much code
    df['returns'] = df['Close'] / df['Close'].shift(1)
    df['log_returns'] = np.log(df['returns'])
    df['strat_returns'] = df['position'].shift(1) * df['returns']
    df['strat_log_returns'] = df['position'].shift(1) * df['log_returns']
    df['cum_returns'] = np.exp(df['log_returns'].cumsum()) - 1
    df['strat_cum_returns'] = np.exp(df['strat_log_returns'].cumsum()) - 1
    df['peak'] = df['cum_returns'].cummax()
    df['strat_peak'] = df['strat_cum_returns'].cummax()
    return df


One thing to note on the exit conditions: rather than waiting 5 days for confirmation from a peak in the RSI, I added a condition that says if the RSI breaks below our entry RSI on a long position, or above our entry RSI on a short position, we get out. This works because if we short on a lower high in the RSI, we’ll exit if that pattern reverses. If the RSI closes above our entry RSI, then either that becomes a higher high, breaking our trend, or a higher high is still to come. Putting this condition in just gets us out of the trade much more quickly.
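As a toy walk-through of that long-exit rule (hypothetical RSI values, not from the backtest), holding requires the RSI to stay below 50 and above the entry RSI:

```python
# Long entered with the RSI at 40; hold while RSI < 50 and RSI > entry_rsi
entry_rsi = 40.0
rsi_path = [42.0, 45.0, 38.0, 47.0]  # hypothetical daily RSI values

position = 1
for rsi in rsi_path:
    if not (rsi < 50 and rsi > entry_rsi):
        position = 0  # RSI broke below the entry RSI (38 < 40), so exit
        break

print(position)  # 0
```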

Ok, enough explanation, let’s test it on data from 2000-2020.

ticker = 'XOM'  # Exxon Mobil
yfObj = yf.Ticker(ticker)

start = '2000-01-01'
end = '2020-12-31'
data = yfObj.history(start=start, end=end)
# Drop unused columns
data.drop(['Open', 'High', 'Low', 'Volume', 'Dividends',
    'Stock Splits'], axis=1, inplace=True)

df_div = RSIDivergenceStrategy(data.copy())

plt.figure(figsize=(12, 8))
plt.plot(df_div['cum_returns'] * 100, label='Buy-and-Hold')
plt.plot(df_div['strat_cum_returns'] * 100, label='RSI Divergence')
plt.xlabel('Date')
plt.ylabel('Returns (%)')
plt.title(f'Buy-and-Hold and RSI Divergence Returns for {ticker}')
plt.legend()
plt.show()

df_stats = pd.DataFrame(getStratStats(df_div['log_returns']),
    index=['Buy and Hold'])
df_stats = pd.concat([df_stats,
    pd.DataFrame(getStratStats(df_div['strat_log_returns']),
        index=['Divergence'])])

df_stats


In the end, the divergence strategy wound up outperforming a buy-and-hold approach (ignoring dividends, which Exxon does pay). It did so with less volatility and smaller drawdowns, but it underperformed from 2004-2020. In other words, you’d have waited 16 years with what looked like a losing strategy against the underlying before finally pulling ahead in 2020. This strategy might work better elsewhere or fit into a diversified portfolio, but at least in this case, a pure RSI divergence strategy doesn’t look great.

## RSI Divergence and Trend

For this next model, let’s take Kaufman’s suggestion and allow the divergence to convert to a trend position. For this, we choose an EMA crossover. The model trades just like the divergence model above, but also checks the trend as indicated by the crossover: if we’re long and EMA1 > EMA2, we keep the position on.

Code for the EMA calculation and the strategy are given below:

def _calcEMA(P, last_ema, N):
    return (P - last_ema) * (2 / (N + 1)) + last_ema

def calcEMA(data, N):
    # Initialize series: seed the EMA with the SMA
    data['SMA_' + str(N)] = data['Close'].rolling(N).mean()
    ema = np.zeros(len(data))
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if i < N:
            ema[i] += row['SMA_' + str(N)]
        else:
            ema[i] += _calcEMA(row['Close'], ema[i-1], N)
    data['EMA_' + str(N)] = ema.copy()
    return data

def RSIDivergenceWithTrendStrategy(data, P=14, order=5, K=2, EMA1=50, EMA2=200):
    '''
    Go long/short on price and RSI divergence.
    - Long if price to lower low and RSI to higher low with RSI < 50
    - Short if price to higher high and RSI to lower high with RSI > 50
    Sell if divergence disappears or if the RSI crosses the centerline,
    unless there is a trend in the same direction.
    '''
    data = getPeaks(data, key='Close', order=order, K=K)
    data = calcRSI(data, P=P)
    data = getPeaks(data, key='RSI', order=order, K=K)
    data = calcEMA(data, EMA1)
    data = calcEMA(data, EMA2)

    position = np.zeros(data.shape[0])
    for i, (t, row) in enumerate(data.iterrows()):
        if np.isnan(row['RSI']):
            continue
        # If no position is on
        if position[i-1] == 0:
            # Buy if price to lower low and indicator to higher low
            if row['Close_lows'] == -1 and row['RSI_lows'] == 1:
                if row['RSI'] < 50:
                    position[i] = 1
                    entry_rsi = row['RSI'].copy()

            # Short if price to higher high and indicator to lower high
            elif row['Close_highs'] == 1 and row['RSI_highs'] == -1:
                if row['RSI'] > 50:
                    position[i] = -1
                    entry_rsi = row['RSI'].copy()

        # If current position is long, hold while the divergence is
        # intact or while the trend is up
        elif position[i-1] == 1:
            if row['RSI'] < 50 and row['RSI'] > entry_rsi:
                position[i] = 1
            elif row[f'EMA_{EMA1}'] > row[f'EMA_{EMA2}']:
                position[i] = 1

        # If current position is short, hold while the divergence is
        # intact or while the trend is down
        elif position[i-1] == -1:
            if row['RSI'] > 50 and row['RSI'] < entry_rsi:
                position[i] = -1
            elif row[f'EMA_{EMA1}'] < row[f'EMA_{EMA2}']:
                position[i] = -1

    data['position'] = position
    return calcReturns(data)


Running this model on our data we get:

df_trend = RSIDivergenceWithTrendStrategy(data.copy())

plt.figure(figsize=(12, 8))
plt.plot(df_trend['cum_returns'] * 100, label='Buy-and-Hold')
plt.plot(df_trend['strat_cum_returns'] * 100, label='RSI Div + Trend')
plt.xlabel('Date')
plt.ylabel('Returns (%)')
plt.title(f'Buy-and-Hold and Divergence with Trend Returns for {ticker}')
plt.legend()
plt.show()

df_stats = pd.concat([df_stats,
    pd.DataFrame(getStratStats(df_trend['strat_log_returns']),
        index=['Div + Trend'])])
df_stats


Adding our trend indicator greatly increased our returns. It beat the underlying with less volatility (although more than the RSI divergence strategy) and higher risk-adjusted returns. The max drawdown was smaller than the largest experienced by the underlying and shorter in duration.

We looked at coding and trading two RSI divergence strategies, one was great and the other wasn’t. Does this mean you should go out and trade the RSI divergence with an EMA cross-over?

No.

500% returns look great, but keep in mind that we did this on a single instrument, without money or risk management principles, just with the basic settings, in a vectorized backtest that doesn’t account for transaction costs or dividends, and with free data that may or may not be reliable.

This is here to give you some ideas and explanation about these indicators. These quick and dirty backtests are useful because you get a feel for how to test ideas and can go out and test it on a variety of securities and markets to begin to narrow down your options. Maybe a more rigorous test shows the RSI divergence is a really valuable part of your system and the trend model is an outlier. You’ll never know unless you test it!

## 4 Simple Strategies to Trade Bollinger Bands

Bollinger Bands have been popular with traders since they were introduced in the early 1980s. They’re calculated in four easy steps and are intended to give traders an idea of the price range of a security.

We can use them to develop a number of different algorithmic strategies to test. Below, we walk through four different trading strategies relying on the bands for mean reversion or trend following, depending on how we decide to interpret the signals.

# Bollinger Bands and Mean Reversion

For the standard Bollinger Band set-up, we’re looking at a 20-day moving average of the typical price (SMA(TP)) +/- 2 standard deviations (2σ). If the typical price follows a normal distribution (i.e. a bell curve), then it has a ~5% chance of moving 2 or more standard deviations away from the mean. In other words, we’ve got roughly a 1-in-20 chance of reaching the edge of our standard Bollinger Band.
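As a quick sketch of that arithmetic (on a synthetic price series, not real market data), the standard 20-day, 2σ bands come out of a few lines of pandas:

```python
import numpy as np
import pandas as pd

# Synthetic OHLC-style data purely for illustration
rng = np.random.default_rng(42)
close = 100 + np.cumsum(rng.normal(0, 1, 60))
df = pd.DataFrame({'High': close + 1, 'Low': close - 1, 'Close': close})

# Typical price and the standard 20-day, 2-sigma bands
df['TP'] = df[['High', 'Low', 'Close']].mean(axis=1)
df['TP_SMA'] = df['TP'].rolling(20).mean()
df['STD'] = df['TP'].rolling(20).std(ddof=1)
df['UBB'] = df['TP_SMA'] + 2 * df['STD']
df['LBB'] = df['TP_SMA'] - 2 * df['STD']
```

The bands are symmetric around the moving average by construction, which the backtests below rely on.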

The mean reversion trader looks at this and wants to bet that the price will move back towards the SMA(TP) in the near term. So if we hit the upper Bollinger Band (UBB), we go short; if we hit the lower (LBB), we go long, and hold until we reach the SMA(TP).

Hopefully that basic intuition makes sense to you because we’re going to jump into coding this strategy up and seeing how it performs.

Let’s import some basic packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf


Next I’m going to write a function to calculate our Bollinger Bands.

def calcBollingerBand(data, periods=20, m=2, label=None):
    '''Calculates Bollinger Bands'''
    keys = ['UBB', 'LBB', 'TP', 'STD', 'TP_SMA']
    if label is None:
        ubb, lbb, tp, std, tp_sma = keys
    else:
        ubb, lbb, tp, std, tp_sma = [i + '_' + label for i in keys]
    data[tp] = data.apply(
        lambda x: np.mean(x[['High', 'Low', 'Close']]), axis=1)
    data[tp_sma] = data[tp].rolling(periods).mean()
    data[std] = data[tp].rolling(periods).std(ddof=1)
    data[ubb] = data[tp_sma] + m * data[std]
    data[lbb] = data[tp_sma] - m * data[std]
    return data


This is going to take our data frame from YFinance and calculate all of our necessary intermediate values, then output typical price (TP), SMA(TP) (TP_SMA), upper Bollinger Band (UBB), and lower Bollinger Band (LBB). In addition to our data, it needs the number of periods (periods) and number of standard deviations (m) we use in our calculations. I also added an optional label argument that will update the keys in the data frame because some strategies we’ll look at use two sets of Bollinger Bands and we don’t want the values to be overwritten when we make our calculations.
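To see the label mechanics at work, here’s a compact restatement of the same calculation (a sketch for illustration, not the exact function above), computing two sets of bands that coexist without overwriting each other:

```python
import numpy as np
import pandas as pd

def calc_bands(data, periods=20, m=2, label=None):
    # Minimal restatement of calcBollingerBand to show the label suffixing
    keys = ['UBB', 'LBB', 'TP', 'STD', 'TP_SMA']
    if label is not None:
        keys = [k + '_' + label for k in keys]
    ubb, lbb, tp, std, tp_sma = keys
    data[tp] = data[['High', 'Low', 'Close']].mean(axis=1)
    data[tp_sma] = data[tp].rolling(periods).mean()
    data[std] = data[tp].rolling(periods).std(ddof=1)
    data[ubb] = data[tp_sma] + m * data[std]
    data[lbb] = data[tp_sma] - m * data[std]
    return data

close = pd.Series(np.linspace(100, 110, 40))
df = pd.DataFrame({'High': close + 1, 'Low': close - 1, 'Close': close})

# Two sets of bands live side by side with distinct column names
df = calc_bands(df, m=1, label='m1')
df = calc_bands(df, m=2, label='m2')
print('UBB_m1' in df.columns and 'UBB_m2' in df.columns)  # True
```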

Next, we’ll write our mean reversion strategy.

def BBMeanReversion(data, periods=20, m=2, shorts=True):
    '''
    Buy/short when price moves outside of bands.
    Exit position when price crosses SMA(TP).
    '''
    data = calcBollingerBand(data, periods, m)
    # Ratio of today's gap to yesterday's gap between close and SMA(TP);
    # a negative sign marks a cross
    xs = (data['Close'] - data['TP_SMA']) / \
        (data['Close'].shift(1) - data['TP_SMA'].shift(1))

    data['position'] = np.nan
    data['position'] = np.where(data['Close']<=data['LBB'], 1,
        data['position'])
    if shorts:
        data['position'] = np.where(data['Close']>=data['UBB'], -1,
            data['position'])
    else:
        data['position'] = np.where(data['Close']>=data['UBB'], 0,
            data['position'])

    # Exit when price crosses SMA(TP)
    data['position'] = np.where(np.sign(xs)==-1, 0, data['position'])

    data['position'] = data['position'].ffill().fillna(0)
    return calcReturns(data)


This strategy is going to go long when the price moves to the LBB and go short (if shorts=True) when the price reaches the UBB. It will sell if the price crosses the SMA(TP). We do this by looking for a sign change in the difference between the closing price and the SMA(TP) from one day to the next.
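The sign-change test is easier to see in isolation on a toy series (hypothetical numbers, not from the backtest):

```python
import numpy as np
import pandas as pd

close = pd.Series([9.0, 9.5, 10.2, 10.8, 10.4])   # toy closing prices
sma = pd.Series([10.0, 10.0, 10.0, 10.0, 10.0])   # toy SMA(TP), held flat

# Ratio of today's gap to yesterday's gap; a negative sign means the
# close crossed the SMA between yesterday and today
xs = (close - sma) / (close.shift(1) - sma.shift(1))
crossed = np.sign(xs) == -1
print(crossed.tolist())  # [False, False, True, False, False]
```

The cross is flagged on the third bar, where the close moves from 9.5 (below the SMA) to 10.2 (above it).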

At the end, you’ll see I don’t simply return the data, but I wrap it in a calcReturns function. This is a helper function that makes it easy to get the returns for our strategy and the buy-and-hold baseline we’ll benchmark ourselves against. The code for this is given below.

def calcReturns(df):
    # Helper function to avoid repeating too much code
    df['returns'] = df['Close'] / df['Close'].shift(1)
    df['log_returns'] = np.log(df['returns'])
    df['strat_returns'] = df['position'].shift(1) * df['returns']
    df['strat_log_returns'] = df['position'].shift(1) * df['log_returns']
    df['cum_returns'] = np.exp(df['log_returns'].cumsum()) - 1
    df['strat_cum_returns'] = np.exp(df['strat_log_returns'].cumsum()) - 1
    df['peak'] = df['cum_returns'].cummax()
    df['strat_peak'] = df['strat_cum_returns'].cummax()
    return df


Now we just need our data and we can take a look at how this strategy performs. I’m going to just grab some data from the S&P 500 and test it over 21 years from 2000-2020.

table = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = table[0]
syms = df['Symbol']
# Sample symbols
ticker = np.random.choice(syms.values)
print(f"Ticker Symbol: {ticker}")

start = '2000-01-01'
end = '2020-12-31'

# Get Data
yfObj = yf.Ticker(ticker)
data = yfObj.history(start=start, end=end)
# Drop unused columns
data.drop(['Open', 'Volume', 'Dividends',
'Stock Splits'], inplace=True, axis=1)

df_rev = BBMeanReversion(data.copy(), shorts=True)

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)

ax[0].plot(df_rev['Close'], label='Close')
ax[0].plot(df_rev['TP_SMA'], label='SMA(TP)')
ax[0].plot(df_rev['UBB'], color=colors[2])
ax[0].plot(df_rev['LBB'], color=colors[2])
ax[0].fill_between(df_rev.index, df_rev['UBB'], df_rev['LBB'],
alpha=0.3, color=colors[2], label='Bollinger Band')

ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and Indicators for {ticker}')
ax[0].legend()

ax[1].plot(df_rev['cum_returns'] * 100, label='Buy and Hold')
ax[1].plot(df_rev['strat_cum_returns'] * 100, label='Mean Reversion')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('Returns (%)')
ax[1].set_title(f'Buy and Hold and Mean Reversion Returns for {ticker}')
ax[1].legend()

plt.tight_layout()
plt.show()


It’s a little difficult to see the Bollinger Band and its interface with the price at this 21-year scale. But it is easy to see that this strategy fell flat versus a simple buy and hold approach. Let’s use another helper function to get the statistics for both so we can go a little deeper.

def getStratStats(log_returns: pd.Series, risk_free_rate: float = 0.02):
    stats = {}
    # Total Returns
    stats['tot_returns'] = np.exp(log_returns.sum()) - 1
    # Mean Annual Returns
    stats['annual_returns'] = np.exp(log_returns.mean() * 252) - 1
    # Annual Volatility
    stats['annual_volatility'] = log_returns.std() * np.sqrt(252)
    # Sortino Ratio
    annualized_downside = log_returns.loc[log_returns<0].std() * \
        np.sqrt(252)
    stats['sortino_ratio'] = (stats['annual_returns'] - \
        risk_free_rate) / annualized_downside
    # Sharpe Ratio
    stats['sharpe_ratio'] = (stats['annual_returns'] - \
        risk_free_rate) / stats['annual_volatility']
    # Max Drawdown
    cum_returns = log_returns.cumsum() - 1
    peak = cum_returns.cummax()
    drawdown = peak - cum_returns
    stats['max_drawdown'] = drawdown.max()
    # Max Drawdown Duration
    strat_dd = drawdown[drawdown==0]
    strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
    strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
    strat_dd_days = np.hstack([strat_dd_days,
        (drawdown.index[-1] - strat_dd.index[-1]).days])
    stats['max_drawdown_duration'] = strat_dd_days.max()
    return stats


bh_stats = getStratStats(df_rev['log_returns'])
rev_stats = getStratStats(df_rev['strat_log_returns'])

df = pd.DataFrame(bh_stats, index=['Buy and Hold'])
df = pd.concat([df, pd.DataFrame(rev_stats,
    index=['Mean Reversion'])])
df


Overall, this strategy had a bad go of it, losing just about all of our starting capital. It wasn’t always a loser, though; below we can see the annual performance of the strategy versus the buy and hold approach.

df_rev['year'] = df_rev.index.map(lambda x: x.year)
ann_rets = df_rev.groupby('year')[['log_returns',
    'strat_log_returns']].sum()
ann_rets.columns = ['Buy and Hold Returns', 'Mean Reversion Returns']
ann_rets.apply(lambda x: np.exp(x).round(4) - 1, axis=1) * 100


Our mean reversion model was either really hot or really cold. From 2003-2009, it was simply compounding at terrible rates year after year, making it impossible to ever come back. We can also see that this stock, as well as the strategy, had very high volatility (often a good thing for these strategies), but it was caught on the wrong side of the moves far too often.

long_entry = df_rev.loc[(df_rev['position']==1) &
    (df_rev['position'].shift(1)==0)]['Close']
long_exit = df_rev.loc[(df_rev['position']==0) &
    (df_rev['position'].shift(1)==1)]['Close']
short_entry = df_rev.loc[(df_rev['position']==-1) &
    (df_rev['position'].shift(1)==0)]['Close']
short_exit = df_rev.loc[(df_rev['position']==0) &
    (df_rev['position'].shift(1)==-1)]['Close']

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
fig, ax = plt.subplots(1, figsize=(15, 10))
ax.plot(df_rev['Close'], label='Close')
ax.plot(df_rev['TP_SMA'], label='SMA(TP)')
ax.plot(df_rev['UBB'], color=colors[2])
ax.plot(df_rev['LBB'], color=colors[2])
ax.fill_between(df_rev.index, df_rev['UBB'], df_rev['LBB'],
    alpha=0.3, color=colors[2], label='Bollinger Band')
ax.scatter(long_entry.index, long_entry, c=colors[4],
    s=100, marker='^', label='Long Entry', zorder=10)
ax.scatter(long_exit.index, long_exit, c=colors[4],
    s=100, marker='x', label='Long Exit', zorder=10)
ax.scatter(short_entry.index, short_entry, c=colors[3],
    s=100, marker='^', label='Short Entry', zorder=10)
ax.scatter(short_exit.index, short_exit, c=colors[3],
    s=100, marker='x', label='Short Exit', zorder=10)
ax.set_ylabel('Price ($)')
ax.set_title(
f'Price and Bollinger Band Mean Reversion Strategy for {ticker} in 2002')
ax.legend()

ax.set_ylim([0, 20])
ax.set_xlim([pd.to_datetime('2002-01-01'), pd.to_datetime('2002-12-31')])
plt.tight_layout()
plt.show()


Mean reversion performed poorly, but we can flip this into a trend-following model that buys when the price breaks above the upper band.

def BBBreakout(data, periods=20, m=1, shorts=True):
    '''
    Buy when price moves above the upper band, short when it moves
    below the lower band.
    Exit when the price moves back into the band.
    '''
    data = calcBollingerBand(data, periods, m)

    data['position'] = np.where(data['Close']>data['UBB'], 1, 0)
    if shorts:
        data['position'] = np.where(data['Close']<data['LBB'], -1,
            data['position'])
    data['position'] = data['position'].ffill().fillna(0)

    return calcReturns(data)

df_break = BBBreakout(data.copy())

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)

ax[0].plot(df_break['Close'], label='Close')
ax[0].plot(df_break['UBB'], color=colors[2])
ax[0].plot(df_break['LBB'], color=colors[2])
ax[0].fill_between(df_break.index, df_break['UBB'], df_break['LBB'],
alpha=0.3, color=colors[2], label='Bollinger Band')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and Bollinger Bands for {ticker}')
ax[0].legend()

ax[1].plot(df_break['cum_returns'] * 100, label='Buy and Hold')
ax[1].plot(df_break['strat_cum_returns'] * 100, label='Breakout')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('Returns (%)')
ax[1].set_title('Cumulative Returns for Breakout Strategy and Buy and Hold')
ax[1].legend()

plt.show()

break_stats = getStratStats(df_break['strat_log_returns'])
df = pd.concat([df, pd.DataFrame(break_stats, index=['Breakout'])])
df


This strategy just blows away the baseline, multiplying the starting capital by 37 times! The Sharpe and Sortino ratios look reasonable as well. The big thing, however, is that huge drawdown after the strategy spiked in 2018 and gave almost all of that back in the subsequent years. Let’s look a little more closely.

long_entry = df_break.loc[(df_break['position']==1) &
    (df_break['position'].shift(1)!=1)]['Close']
long_exit = df_break.loc[(df_break['position']!=1) &
    (df_break['position'].shift(1)==1)]['Close']
short_entry = df_break.loc[(df_break['position']==-1) &
    (df_break['position'].shift(1)!=-1)]['Close']
short_exit = df_break.loc[(df_break['position']!=-1) &
    (df_break['position'].shift(1)==-1)]['Close']

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)

ax[0].plot(df_break['Close'], label='Close')
ax[0].plot(df_break['UBB'], color=colors[2])
ax[0].plot(df_break['LBB'], color=colors[2])
ax[0].fill_between(df_break.index, df_break['UBB'], df_break['LBB'],
    alpha=0.3, color=colors[2], label='Bollinger Band')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and Bollinger Bands for {ticker} (2018)')
ax[0].legend()

ax[0].scatter(long_entry.index, long_entry, c=colors[4],
s=100, marker='^', label='Long Entry',
zorder=10)
ax[0].scatter(long_exit.index, long_exit, c=colors[4],
s=100, marker='x', label='Long Exit',
zorder=10)
ax[0].scatter(short_entry.index, short_entry, c=colors[3],
s=100, marker='^', label='Short Entry',
zorder=10)
ax[0].scatter(short_exit.index, short_exit, c=colors[3],
s=100, marker='x', label='Short Exit',
zorder=10)
ax[0].set_ylim([5, 35])

ax[1].plot(df_break['strat_cum_returns'] * 100, label='Breakout', c=colors[1])
ax[1].set_xlabel('Date')
ax[1].set_ylabel('Returns (%)')
ax[1].set_title('Cumulative Returns for Breakout Strategy')
ax[1].legend()

ax[1].set_xlim([pd.to_datetime('2018-01-01'), pd.to_datetime('2019-01-01')])

plt.show()


In the plot above, we see that 2018 was just about a perfect year for this model. As the underlying security moved up 3x in price, our model was on the right side of nearly every trade and was able to 3x its equity from the low to the peak. It gave back some of those gains in late October when a short position moved against it.

Then, the drawdown happened.

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)

ax[0].plot(df_break['Close'], label='Close')
ax[0].plot(df_break['UBB'], color=colors[2])
ax[0].plot(df_break['LBB'], color=colors[2])
ax[0].fill_between(df_break.index, df_break['UBB'], df_break['LBB'],
alpha=0.3, color=colors[2], label='Bollinger Band')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and Bollinger Bands for {ticker} (2019-2020)')
ax[0].legend()

ax[0].scatter(long_entry.index, long_entry, c=colors[4],
    s=100, marker='^', label='Long Entry', zorder=10)
ax[0].scatter(long_exit.index, long_exit, c=colors[4],
    s=100, marker='x', label='Long Exit', zorder=10)
ax[0].scatter(short_entry.index, short_entry, c=colors[3],
    s=100, marker='^', label='Short Entry', zorder=10)
ax[0].scatter(short_exit.index, short_exit, c=colors[3],
    s=100, marker='x', label='Short Exit', zorder=10)
ax[0].set_ylim([15, 60])

ax[1].plot(df_break['strat_cum_returns'] * 100, label='Breakout', c=colors[1])
ax[1].set_xlabel('Date')
ax[1].set_ylabel('Returns (%)')
ax[1].set_title('Cumulative Returns for Breakout Strategy')
ax[1].legend()

ax[1].set_xlim([pd.to_datetime('2019-01-01'), pd.to_datetime('2020-06-01')])
ax[1].set_ylim([3500, 10000])

plt.show()


In 2019, our model just couldn’t get anything going. It came down quite a ways as it kept losing on a series of false breakouts to the upside early in the year. About midway through the year, it switched and couldn’t catch a break, losing money to the downside again and again. It got a pair of nice upward moves late in the year, but not enough to make up for the losses it had sustained so far. When the COVID crash came, it consistently got caught on the wrong side by selling short after big down moves, only to see the price reverse and be forced to close the position at a loss.

Overall, the performance of this model was tremendous. But let’s see if we can do a bit better by adding another Bollinger Band to sell when the price moves too far before it reverses.

## Double Bollinger Band Breakout

For this next strategy, we’re going to buy when the price breaks above the inner band, set at 1σ, but sell if the price moves beyond the outer band at 2σ. We want to capture the upside of the breakout model, but close out positions before they reverse on us.

def DoubleBBBreakout(data, periods=20, m1=1, m2=2, shorts=True):
    '''
    Buy/short when price moves outside of the inner band (m1).
    Exit when the price moves back into the inner band or beyond
    the outer band (m2).
    '''
    assert m2 > m1, f'm2 must be greater than m1:\nm1={m1}\tm2={m2}'
    data = calcBollingerBand(data, periods, m1, label='m1')
    data = calcBollingerBand(data, periods, m2, label='m2')

    data['position'] = np.where(data['Close']>data['UBB_m1'], 1, 0)
    if shorts:
        data['position'] = np.where(data['Close']<data['LBB_m1'], -1,
            data['position'])
    data['position'] = np.where(data['Close']>data['UBB_m2'], 0,
        data['position'])
    data['position'] = np.where(data['Close']<data['LBB_m2'], 0,
        data['position'])
    data['position'] = data['position'].ffill().fillna(0)
    return calcReturns(data)


df_double = DoubleBBBreakout(data.copy())

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)

ax[0].plot(df_double['Close'], label='Close', linewidth=0.5)
ax[0].plot(df_double['UBB_m1'], color=colors[2], linewidth=0.5)
ax[0].plot(df_double['LBB_m1'], color=colors[2], linewidth=0.5)
ax[0].fill_between(df_double.index, df_double['UBB_m1'], df_double['LBB_m1'],
    alpha=0.3, color=colors[2], label='Inner Bollinger Band')
ax[0].plot(df_double['UBB_m2'], color=colors[4], linewidth=0.5)
ax[0].plot(df_double['LBB_m2'], color=colors[4], linewidth=0.5)
ax[0].fill_between(df_double.index, df_double['UBB_m2'], df_double['LBB_m2'],
    alpha=0.3, color=colors[4], label='Outer Bollinger Band')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and Bollinger Bands for {ticker}')
ax[0].legend()

ax[1].plot(df_double['cum_returns'] * 100, label='Buy and Hold')
ax[1].plot(df_double['strat_cum_returns'] * 100, label='Double Breakout')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('Returns (%)')
ax[1].set_title('Cumulative Returns for Double Breakout Strategy and Buy and Hold')
ax[1].legend()

plt.show()

double_stats = getStratStats(df_double['strat_log_returns'])
df = pd.concat([df,
pd.DataFrame(double_stats, index=['Double Breakout'])])
df


As hoped, this model does reduce our volatility versus both the buy and hold and the previous breakout model. However, it also gives up some total return, while still outperforming the buy and hold model and the mean reversion approaches. Surprisingly, we also reduce our Sortino Ratio versus the breakout model, but do increase our Sharpe Ratio.

In our intro article to Bollinger Bands, we mentioned a strategy recommended by John Bollinger that relies on the band width. This is calculated by taking the difference between the UBB and LBB and dividing by the SMA(TP).
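As a rough, self-contained sketch of that calculation (using the SMA of the close rather than the typical-price SMA used elsewhere in the article, so the numbers are illustrative only):

```python
import numpy as np
import pandas as pd

def band_width(close: pd.Series, periods: int = 20, m: float = 2.0) -> pd.Series:
    # Middle band and the rolling standard deviation
    sma = close.rolling(periods).mean()
    std = close.rolling(periods).std()
    ubb = sma + m * std  # upper Bollinger Band
    lbb = sma - m * std  # lower Bollinger Band
    # Band Width: (UBB - LBB) / SMA, i.e. 2 * m * std / sma
    return (ubb - lbb) / sma

prices = pd.Series(np.linspace(100, 110, 40))
width = band_width(prices)
```

Because the width is normalized by the moving average, it can be compared across price levels and across securities.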

As the width decreases, volatility decreases. Bollinger states that when this occurs, we can expect an increase in volatility following a low in the band width. Unfortunately, there is no directional indicator associated with a low in volatility, so we need to combine it with something else to tell us whether we should be long or short.

I have no idea if this tendency occurs or could be a tradeable signal, so let’s put together a strategy that we can test. Whether it works here or not does not prove or disprove Bollinger’s claim — we’re running a simple, vectorized backtest on a single security to demonstrate how these signals could be used in a more complete strategy. So don’t go crazy one way or the other; just take the results as a data point and do your own investigation (the same can be said for all of the simple backtests we’re running).

Anyway… to test this, we’ll combine a low point in the Bollinger Band Width with an EMA crossover to get a directional signal. Why EMA? It’s more responsive to recent price changes than the SMA because of the heavier weighting given to the latest prices. If there is a tradeable increase in volatility following a Band Width low, we’d want to jump on it quickly, and the EMA is more likely to pick that up.
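To see that responsiveness difference concretely, here is a small comparison on a toy series with a sudden jump (this uses pandas’ built-in ewm for brevity rather than the article’s own calcEMA):

```python
import pandas as pd

# Flat prices followed by a sudden jump
prices = pd.Series([100.0] * 20 + [110.0])

# 10-period SMA vs. an EMA with the same span
sma = prices.rolling(10).mean()
ema = prices.ewm(span=10, adjust=False).mean()

# After the jump, the EMA has moved further toward the new price than the SMA
print(f"SMA: {sma.iloc[-1]:.2f}, EMA: {ema.iloc[-1]:.2f}")
```

The SMA moves by the jump divided by the window length, while the EMA moves by the jump times 2/(N+1), so the EMA reacts roughly twice as fast here.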

To implement this strategy, we’ll need to use some code for calculating the EMA.

def _calcEMA(P, last_ema, N):
    return (P - last_ema) * (2 / (N + 1)) + last_ema

def calcEMA(data, N):
    # Initialize series with the SMA until we have N periods of data
    data['SMA_' + str(N)] = data['Close'].rolling(N).mean()
    ema = np.zeros(len(data))
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if i < N:
            ema[i] += row['SMA_' + str(N)]
        else:
            ema[i] += _calcEMA(row['Close'], ema[i-1], N)
    data['EMA_' + str(N)] = ema.copy()
    return data


Next, we need a full definition for our strategy. We’ll use the standard Bollinger Band settings of 20 days and 2\sigma. We’ll look for 20-day lows in the Band Width and go long if the short-term, 10-day EMA moves above the longer-term, 30-day EMA. If we get a low in the Band Width and the short-term EMA moves below the long-term EMA, we’ll go short. We exit a position whenever the short-term EMA crosses back over the long-term EMA.

Now to the code (with some additional doc strings for clarity):

def BBWidthEMACross(data, periods=20, m=2, N=20, EMA1=10, EMA2=30, shorts=True):
    '''
    Buys when Band Width reaches 20-day low and EMA1 > EMA2.
    Shorts when Band Width reaches 20-day low and EMA1 < EMA2.
    Exits position when EMA reverses.
    :periods: number of periods for Bollinger Band calculation.
    :m: number of standard deviations for Bollinger Band.
    :N: number of periods used to find a low.
    :EMA1: number of periods used in the short-term EMA signal.
    :EMA2: number of periods used in the long-term EMA signal.
    :shorts: boolean value to indicate whether or not shorts are allowed.
    '''
    assert EMA1 < EMA2, "EMA1 must be less than EMA2."
    # Calculate indicators
    data = calcBollingerBand(data, periods, m)
    data['width'] = (data['UBB'] - data['LBB']) / data['TP_SMA']
    data['min_width'] = data['width'].rolling(N).min()
    data = calcEMA(data, EMA1)
    data = calcEMA(data, EMA2)

    data['position'] = np.nan
    data['position'] = np.where(
        (data['width']==data['min_width']) &
        (data[f'EMA_{EMA1}']>data[f'EMA_{EMA2}']), 1, 0)
    if shorts:
        data['position'] = np.where(
            (data['width']==data['min_width']) &
            (data[f'EMA_{EMA1}']<data[f'EMA_{EMA2}']), -1,
            data['position'])
    data['position'] = data['position'].ffill().fillna(0)

    return calcReturns(data)

df_bw_ema = BBWidthEMACross(data.copy())

bw_mins = df_bw_ema.loc[df_bw_ema['width']==df_bw_ema['min_width']]['width']

fig, ax = plt.subplots(3, figsize=(20, 12), sharex=True)
ax[0].plot(df_bw_ema['Close'], label='Close')
ax[0].plot(df_bw_ema['EMA_10'], label='EMA-10')
ax[0].plot(df_bw_ema['EMA_30'], label='EMA-30')
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'Price and EMAs for {ticker}')
ax[0].legend()

ax[1].plot(df_bw_ema['width'], label='Band Width')
ax[1].scatter(bw_mins.index, bw_mins, s=100, marker='o',
    c=colors[1], label='20-Day Minima')
ax[1].set_ylabel('Bollinger Band Width')
ax[1].set_title('Bollinger Band Width and Local Minima')
ax[1].legend()

ax[2].plot(df_bw_ema['cum_returns'] * 100, label='Buy and Hold')
ax[2].plot(df_bw_ema['strat_cum_returns'] * 100, label='Strat Rets')
ax[2].set_xlabel('Date')
ax[2].set_ylabel('Returns (%)')
ax[2].set_title('Cumulative Returns for Band Width EMA and Buy and Hold')
ax[2].legend()

plt.show()

bw_ema_stats = pd.DataFrame(getStratStats(df_bw_ema['strat_log_returns']),
    index=['Band Width and EMA'])
df = pd.concat([df, bw_ema_stats])
df


This strategy loses most of its starting capital over the time horizon and winds up underperforming most of the other strategies. One issue is that the 20-day minimum might not be discriminating enough. There are times when the Band Width rises significantly (e.g. 2003, 2008, 2009), so a 20-day minimum winds up being rather elevated. We could add a further condition that must be met before putting the trade on, such as a 20-day minimum and width < 0.2, for example. We could also extend the look-back period to 30, 60, or more days, which would help us avoid buying in the midst of those high-volatility periods. This all assumes that Bollinger is correct about buying in low Band Width periods to begin with, of course.

# What should you trade?

There are heaps of other ways you can use Bollinger Bands in your trading. You can combine them with other indicators and methods to build better (or worse) strategies than shown above. One thing to keep in mind — don’t trade based on a single, vectorized backtest like we showed here.
These are designed to give you a feel for how these models work, but there are too many simplifications: no diversity in instruments, no money management, no transaction costs — nothing that you need to develop a real strategy. All of those features are difficult to implement. However, we are developing a complete, no-code trading system to allow you to build, test, and deploy your own algorithmic trading systems. Drop your email address below to get updates and a chance to be on our pre-release list!

## Beginner’s Guide to Trading the Stochastic RSI

The stochastic RSI (StochRSI) is regularly used to spot areas where a security’s price has extended too far in one direction or the other. This makes it ripe for use in a mean-reversion strategy where you buy low and sell high, or short it if it gets too high with the hope that the price drops. Like the RSI, from which it is derived, it can also be used to spot momentum trends. The indicator is an oscillator that moves from 0 to 100 with 50 as the centerline. Following this interpretation, we can use moves above 50 to spot upward momentum and moves below 50 to trade the short side.

We’ll walk through both of these approaches with backtests in Python for you to replicate on your own. If you’re not familiar with the StochRSI, you can find all the details on calculating the indicator here, with clear explanations, the math, and code examples.

## Testing a Mean-Reverting Stochastic RSI Strategy

The most common StochRSI strategy is based on mean reversion. Like the RSI, the StochRSI often uses 80 to indicate overbought levels to short, and 20 to indicate oversold levels to buy. Additionally, a 14-day lookback and smoothing period is common. For our purposes here, we’ll stick with these standard values.

Now, getting to the code, let’s import a few standard packages in Python.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf


Next, we’re going to build a function to calculate our indicator (we’re going to skip the discussion; you can find all the details here). We’ll call it calcStochRSI() and it will rely on a few functions to calculate the RSI and the stochastic oscillator to get our indicator of choice.

def calcRSI(data, P=14):
    # Calculate gains and losses
    data['diff_close'] = data['Close'] - data['Close'].shift(1)
    data['gain'] = np.where(data['diff_close']>0, data['diff_close'], 0)
    data['loss'] = np.where(data['diff_close']<0, np.abs(data['diff_close']), 0)
    # Get initial values
    data[['init_avg_gain', 'init_avg_loss']] = data[
        ['gain', 'loss']].rolling(P).mean()
    # Calculate smoothed avg gains and losses for all t > P
    avg_gain = np.zeros(len(data))
    avg_loss = np.zeros(len(data))
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if i < P - 1:
            last_row = row.copy()
            continue
        elif i == P-1:
            avg_gain[i] += row['init_avg_gain']
            avg_loss[i] += row['init_avg_loss']
        else:
            avg_gain[i] += ((P - 1) * avg_gain[i-1] + row['gain']) / P
            avg_loss[i] += ((P - 1) * avg_loss[i-1] + row['loss']) / P
        last_row = row.copy()
    data['avg_gain'] = avg_gain
    data['avg_loss'] = avg_loss
    # Calculate RS and RSI
    data['RS'] = data['avg_gain'] / data['avg_loss']
    data['RSI'] = 100 - 100 / (1 + data['RS'])
    return data

def calcStochOscillator(data, N=14):
    data['low_N'] = data['RSI'].rolling(N).min()
    data['high_N'] = data['RSI'].rolling(N).max()
    data['StochRSI'] = 100 * (data['RSI'] - data['low_N']) / \
        (data['high_N'] - data['low_N'])
    return data

def calcStochRSI(data, P=14, N=14):
    data = calcRSI(data, P)
    data = calcStochOscillator(data, N)
    return data

def calcReturns(df):
    # Helper function to avoid repeating too much code
    df['returns'] = df['Close'] / df['Close'].shift(1)
    df['log_returns'] = np.log(df['returns'])
    df['strat_returns'] = df['position'].shift(1) * df['returns']
    df['strat_log_returns'] = df['position'].shift(1) * df['log_returns']
    df['cum_returns'] = np.exp(df['log_returns'].cumsum()) - 1
    df['strat_cum_returns'] = np.exp(df['strat_log_returns'].cumsum()) - 1
    df['peak'] = df['cum_returns'].cummax()
    df['strat_peak'] = df['strat_cum_returns'].cummax()
    return df


With these functions we just need to build the logic for our strategies and we’re set. Notice too that we have a helper function called calcReturns, which we can quickly apply to the result of our backtest to get all of our return values.

This mean reversion model is going to short or sell (depending on your preference) when the StochRSI moves above 80, and buy when it’s below 20.

def StochRSIReversionStrategy(data, P=14, N=14, short_level=80,
    buy_level=20, shorts=True):
    '''Buys when the StochRSI is oversold and sells when it's overbought'''
    df = calcStochRSI(data, P, N)
    df['position'] = np.nan
    df['position'] = np.where(df['StochRSI']<buy_level, 1, df['position'])
    if shorts:
        df['position'] = np.where(df['StochRSI']>short_level, -1, df['position'])
    else:
        df['position'] = np.where(df['StochRSI']>short_level, 0, df['position'])
    df['position'] = df['position'].ffill().fillna(0)

    return calcReturns(df)

table = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = table[0]
syms = df['Symbol']
# Sample symbols
# ticker = np.random.choice(syms.values)
ticker = "BSX"
print(f"Ticker Symbol: {ticker}")

start = '2000-01-01'
end = '2020-12-31'

# Get Data
yfObj = yf.Ticker(ticker)
data = yfObj.history(start=start, end=end)
data.drop(['Open', 'High', 'Low', 'Volume', 'Dividends', 'Stock Splits'],
    inplace=True, axis=1)

# Run test
df_rev = StochRSIReversionStrategy(data.copy())

# Plot results
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
fig, ax = plt.subplots(2, figsize=(12, 8))

ax[0].plot(df_rev['strat_cum_returns']*100, label='Mean Reversion')
ax[0].plot(df_rev['cum_returns']*100, label='Buy and Hold')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Cumulative Returns for Mean Reversion and' +
    f' Buy and Hold Strategies for {ticker}')
ax[0].legend(bbox_to_anchor=[1, 0.6])

ax[1].plot(df_rev['StochRSI'], label='StochRSI', linewidth=0.5)
ax[1].plot(df_rev['RSI'], label='RSI', linewidth=1)
ax[1].axhline(80, label='Over Bought', color=colors[1], linestyle=':')
ax[1].axhline(20, label='Over Sold', color=colors[2], linestyle=':')
ax[1].axhline(50, label='Centerline', color='k', linestyle=':')
ax[1].set_ylabel('Stochastic RSI')
ax[1].set_xlabel('Date')
ax[1].set_title(f'Stochastic RSI for {ticker}')
ax[1].legend(bbox_to_anchor=[1, 0.75])

plt.tight_layout()
plt.show()


The mean reversion strategy crushes the buy and hold strategy for Boston Scientific, outperforming with a 28X return versus 2X over the 21-year period we examined.

In the second plot, we show the StochRSI and our key levels. I also added the RSI for comparison; the StochRSI is much more volatile. This leads to frequent trading, which could severely impact your actual returns if you have a small account with relatively high transaction costs. We’re just running this on a single instrument, so we wind up with 443 trades, or a trade every 12 days, which doesn’t seem like that much. However, if we were to manage a proper portfolio of instruments with this indicator and trade this frequently, we could be moving in and out of multiple trades per day, and transaction costs become significant. Of course, this is all dependent on your actual system and broker, so it may not be a major concern in your specific case.
# Get trades
diff = df_rev['position'].diff().dropna()
trade_idx = diff.index[np.where(diff!=0)]

fig, ax = plt.subplots(1, figsize=(12, 8))
ax.plot(df_rev['Close'], linewidth=1, label=f'{ticker}')
ax.scatter(trade_idx, df_rev.loc[trade_idx]['Close'],
    c=colors[1], marker='^', label='Trade')
ax.set_ylabel('Price')
ax.set_title(f'{ticker} Price Chart and Trades for ' +
    'StochRSI Mean Reversion Strategy')
ax.legend()
plt.show()


To look at some of the key metrics for the overall strategy, let’s use the following getStratStats function.

def getStratStats(log_returns: pd.Series, risk_free_rate: float = 0.02):
    stats = {}

    # Total Returns
    stats['tot_returns'] = np.exp(log_returns.sum()) - 1

    # Mean Annual Returns
    stats['annual_returns'] = np.exp(log_returns.mean() * 252) - 1

    # Annual Volatility
    stats['annual_volatility'] = log_returns.std() * np.sqrt(252)

    # Sortino Ratio
    annualized_downside = log_returns.loc[log_returns<0].std() * np.sqrt(252)
    stats['sortino_ratio'] = (stats['annual_returns'] - risk_free_rate) \
        / annualized_downside

    # Sharpe Ratio
    stats['sharpe_ratio'] = (stats['annual_returns'] - risk_free_rate) \
        / stats['annual_volatility']

    # Max Drawdown
    cum_returns = log_returns.cumsum() - 1
    peak = cum_returns.cummax()
    drawdown = peak - cum_returns
    stats['max_drawdown'] = drawdown.max()

    # Max Drawdown Duration
    strat_dd = drawdown[drawdown==0]
    strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
    strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
    strat_dd_days = np.hstack([strat_dd_days,
        (drawdown.index[-1] - strat_dd.index[-1]).days])
    stats['max_drawdown_duration'] = strat_dd_days.max()

    return stats

rev_stats = getStratStats(df_rev['strat_log_returns'])
bh_stats = getStratStats(df_rev['log_returns'])

pd.concat([pd.DataFrame(rev_stats, index=['Mean Reversion']),
    pd.DataFrame(bh_stats, index=['Buy and Hold'])])


Here we see the 28X return on this strategy with roughly the same annual volatility as the underlying.
Additionally, we have much, much better risk-adjusted returns as measured by the Sortino and Sharpe Ratios.

We do see one of the potential issues of mean reversion strategies in the COVID crash of 2020. The total returns of the strategy were dramatically cut because the strategy was positioned for a reversion upwards, but the market continued to tank and the model just held on. It recovered some of this, but never approached its pre-COVID highs in this test. Proper use of stops can help limit these large losses and potentially increase overall returns.

## Stochastic RSI and Momentum

The other base strategy we mentioned before uses the StochRSI as a momentum indicator. When the indicator crosses the centerline, we either buy or short the stock based on its direction.

def StochRSIMomentumStrategy(data, P=14, N=14, centerline=50, shorts=True):
    '''
    Buys when the StochRSI moves above the centerline,
    sells when it moves below
    '''
    df = calcStochRSI(data, P, N)
    df['position'] = np.nan
    df['position'] = np.where(df['StochRSI']>centerline, 1, df['position'])
    if shorts:
        df['position'] = np.where(df['StochRSI']<centerline, -1, df['position'])
    else:
        df['position'] = np.where(df['StochRSI']<centerline, 0, df['position'])
    df['position'] = df['position'].ffill().fillna(0)

    return calcReturns(df)


Running our backtest:

# Run test
df_mom = StochRSIMomentumStrategy(data.copy())

# Plot results
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
fig, ax = plt.subplots(2, figsize=(12, 8))

ax[0].plot(df_mom['strat_cum_returns']*100, label='Momentum')
ax[0].plot(df_mom['cum_returns']*100, label='Buy and Hold')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Cumulative Returns for Momentum and' +
    f' Buy and Hold Strategies for {ticker}')
ax[0].legend(bbox_to_anchor=[1, 0.6])

ax[1].plot(df_mom['StochRSI'], label='StochRSI', linewidth=0.5)
ax[1].plot(df_mom['RSI'], label='RSI', linewidth=1)
ax[1].axhline(50, label='Centerline', color='k', linestyle=':')
ax[1].set_ylabel('Stochastic RSI')
ax[1].set_xlabel('Date')
ax[1].set_title(f'Stochastic RSI for {ticker}')
ax[1].legend(bbox_to_anchor=[1, 0.75])

plt.tight_layout()
plt.show()


In this case, our momentum strategy performed quite poorly, losing almost all of our initial investment over our hypothetical time period.

Looking at our strategy’s stats, the only benefit this model has is a slightly shorter drawdown than the buy and hold approach.

mom_stats = getStratStats(df_mom['strat_log_returns'])
bh_stats = getStratStats(df_mom['log_returns'])

pd.concat([pd.DataFrame(mom_stats, index=['Momentum']),
    pd.DataFrame(rev_stats, index=['Mean Reversion']),
    pd.DataFrame(bh_stats, index=['Buy and Hold'])])


This doesn’t mean that the Stochastic RSI is not suited for this type of application. It might work better with additional indicators or filters that could improve our results.

# Building a System

Trading is hard. We shouldn’t expect a random indicator applied to a random stock to have great performance out of the box. A single, poor backtest doesn’t mean a strategy is worthless. Conversely, a great backtest doesn’t mean you’ve got something you should start trading right away.

There’s no magic formula, indicator, or system that you can use to be profitable in trading. It takes hard work and discipline to do well at this game. We are building the tools to help you become a better trader with complete backtest and portfolio management solutions: professional tests, a suite of risk metrics, proper position sizing, and real-time alerts when your custom system needs action. Sign up below to learn more.

## How to Build your First Mean Reversion Trading Strategy in Python

The beautiful thing about markets is they always move. They may not show strong momentum or trends at any given time, but that doesn’t mean you can’t make money off of a strategy.
You can ride the waves of the price action up and down — even on top of a larger trend — to make money and show healthy returns.

# TL;DR

We’ll build a mean reversion strategy that buys when a security reaches an extreme low, and sells when it reaches an extreme high. Properly tuned, this will allow us to extract returns from most any market regime.

# Market Oscillations

Have you ever looked back at a stock and found you held on too long? You should have sold at the top, but instead thought the stock had higher to go? Or, have you waited just a bit too long to get into a trade and missed out on the bottom?

These kinds of mistakes are what oscillating or mean reversion strategies seek to address. They are designed to find those points where a market is over-extended one way or another so you can make your move.

There are a whole host of oscillating indicators like this. For now, we’ll take a look at a basic model — our old simple moving average (SMA) — to serve as our indicator to buy, sell, and short a security.

To begin, we’ll show how you can do this easily in Python. So start with importing some of the basic packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf


To run this strategy, we’ll look at the SMA and see if the price is too high or low compared to the SMA. If the price is too low, we’ll buy it with the expectation that the price will go higher, towards the moving average. If it’s over-extended to the upside, then we sell or go short, again with the expectation that the price is going to drop in the near term.
def SMAMeanReversion(ticker, sma, threshold, shorts=False,
    start_date='2000-01-01', end_date='2020-12-31'):
    yfObj = yf.Ticker(ticker)
    data = yfObj.history(start=start_date, end=end_date)
    data['SMA'] = data['Close'].rolling(sma).mean()
    data['extension'] = (data['Close'] - data['SMA']) / data['SMA']

    data['position'] = np.nan
    data['position'] = np.where(data['extension']<-threshold,
        1, data['position'])
    if shorts:
        data['position'] = np.where(
            data['extension']>threshold, -1, data['position'])
    data['position'] = np.where(np.abs(data['extension'])<0.01,
        0, data['position'])
    data['position'] = data['position'].ffill().fillna(0)

    # Calculate returns and statistics
    data['returns'] = data['Close'] / data['Close'].shift(1)
    data['log_returns'] = np.log(data['returns'])
    data['strat_returns'] = data['position'].shift(1) * \
        data['returns']
    data['strat_log_returns'] = data['position'].shift(1) * \
        data['log_returns']
    data['cum_returns'] = np.exp(data['log_returns'].cumsum())
    data['strat_cum_returns'] = np.exp(data['strat_log_returns'].cumsum())
    data['peak'] = data['cum_returns'].cummax()
    data['strat_peak'] = data['strat_cum_returns'].cummax()

    return data.dropna()


In the function above, we supply a ticker, our SMA period, the threshold, and whether or not we want to include short positions. With that info, we use yfinance to get the data and apply our moving average. We calculate our extension — the move as a percentage of our SMA — to see whether the price is over- or under-extended.

We take a long position when we’re over-extended to the low side, i.e. the extension drops below the negative threshold, because we’re expecting the stock to mean revert and head higher. If it moves too far upward, we go short, or just stay neutral without a position. Because we expect things to revert to the mean, we also exit our position if it comes very close to the mean (within 1% in this case).

We can test this with some different tickers to see how it performs.
We’ll use the getStratStats function we defined in a previous article to evaluate the results.

ticker = 'AAL'
SMA = 50
threshold = 0.1
shorts = False
data = SMAMeanReversion(ticker, SMA, threshold, shorts)
stats_dict = getStratStats(data)
df_stats = pd.DataFrame(stats_dict).round(3)
df_stats


Running this on American Airlines with a 50-day SMA and a 10% extension threshold, we outperform the buy and hold by a significant margin. We can also view the extension and the thresholds, along with our positions over time.

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
fig, ax = plt.subplots(3, figsize=(10, 8), sharex=True)
long = data.loc[data['position']==1]['Close']

ax[0].plot(data['Close'], label='Price', linestyle=':', color=colors[1])
ax[0].plot(data['SMA'], label='SMA', linestyle='--', color=colors[0])
ax[0].scatter(long.index, long, label='Long', c=colors[2])
ax[0].legend(bbox_to_anchor=[1, 0.75])
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'{ticker} Price and Positions with {SMA}-Day Moving Average')
ax[1].plot(data['extension']*100, label='Extension', color=colors[0])
ax[1].axhline(threshold*100, linestyle='--', color=colors[1])
ax[1].axhline(-threshold*100, label='Threshold', linestyle='--', color=colors[1])
ax[1].axhline(0, label='Neutral', linestyle=':', color='k')
ax[1].set_ylabel(f'Extension (%)')
ax[1].legend(bbox_to_anchor=[1, 0.75])
ax[2].plot(data['position'])
ax[2].set_xlabel('Date')
ax[2].set_title('Position')
ax[2].set_yticks([-1, 0, 1])
ax[2].set_yticklabels(['Short', 'Neutral', 'Long'])
plt.tight_layout()
plt.show()


Our long-term cumulative returns show that the strategy was much better than the buy-and-hold baseline for this stock over the long run, even if it had periods of underperformance.

Unfortunately, this strategy is susceptible to losses during large, downward moves. This can be seen clearly in the plot above, where the model takes a large hit during the 2008 crisis and the more recent 2020 COVID crash. The model winds up in exactly the wrong position at these times because it goes long in the face of increased selling.

To address this, we can include a safety threshold, a point at which the model has become too extended so that momentum dominates over mean reversion and we should exit the position.

def SMAMeanReversionSafety(ticker, sma, threshold,
    safety_threshold=0.25, shorts=False,
    start_date='2000-01-01', end_date='2020-12-31'):
    yfObj = yf.Ticker(ticker)
    data = yfObj.history(start=start_date, end=end_date)
    data['SMA'] = data['Close'].rolling(sma).mean()
    data['extension'] = (data['Close'] - data['SMA']) / data['SMA']

    data['position'] = np.nan
    data['position'] = np.where(
        (data['extension']<-threshold) &
        (data['extension']>-safety_threshold),
        1, data['position'])

    if shorts:
        data['position'] = np.where(
            (data['extension']>threshold) &
            (data['extension']<safety_threshold),
            -1, data['position'])

    data['position'] = np.where(np.abs(data['extension'])<0.01,
        0, data['position'])
    data['position'] = data['position'].ffill().fillna(0)

    # Calculate returns and statistics
    data['returns'] = data['Close'] / data['Close'].shift(1)
    data['log_returns'] = np.log(data['returns'])
    data['strat_returns'] = data['position'].shift(1) * \
        data['returns']
    data['strat_log_returns'] = data['position'].shift(1) * \
        data['log_returns']
    data['cum_returns'] = np.exp(data['log_returns'].cumsum())
    data['strat_cum_returns'] = \
        np.exp(data['strat_log_returns'].cumsum())
    data['peak'] = data['cum_returns'].cummax()
    data['strat_peak'] = data['strat_cum_returns'].cummax()

    return data.dropna()


We can include this by updating the np.where arguments to include both the threshold and the safety_threshold variables. So if the extension value is between those bounds, we buy or go short, otherwise we’re neutral. Let’s see how these perform together.
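Here is a toy illustration of that banded logic on a few hypothetical extension values (with shorts enabled; the NaNs mark days with no new signal, which the ffill step later resolves):

```python
import numpy as np

threshold, safety_threshold = 0.1, 0.25
ext = np.array([-0.05, -0.15, -0.30, 0.12, 0.30])

# Long only when the extension is between -safety_threshold and -threshold
position = np.where((ext < -threshold) & (ext > -safety_threshold), 1, np.nan)
# Short only when it is between threshold and safety_threshold
position = np.where((ext > threshold) & (ext < safety_threshold), -1, position)
# Flatten when we're back within 1% of the mean
position = np.where(np.abs(ext) < 0.01, 0, position)
# -0.15 -> long, 0.12 -> short; -0.05, -0.30, and 0.30 produce no signal
```

Note that the -0.30 reading generates no long entry even though the price is far below the SMA, because it is past the safety bound.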

ticker = 'AAL'
SMA = 50
threshold = 0.1
safety_threshold = 0.15
shorts = False
data = SMAMeanReversion(ticker, SMA, threshold, shorts)
data_safe = SMAMeanReversionSafety(ticker, SMA, threshold, safety_threshold, shorts)
safe_stats_dict = getStratStats(data_safe)
df_safe_stats = pd.DataFrame(safe_stats_dict).round(3)
fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(data_safe['strat_cum_returns'], label='Mean Reversion Strategy with Safety')
ax.plot(data['strat_cum_returns'], label='Mean Reversion Strategy')
ax.plot(data_safe['cum_returns'], label=f'{ticker}')
ax.set_xlabel('Date')
ax.set_ylabel('Returns (%)')
ax.set_title('Cumulative Returns for Mean Reversion and Buy and Hold Strategies')
ax.legend()
plt.show()


In this case, the safety threshold model outperforms the mean reversion strategy without the safety parameter nearly every step of the way. At the end, it’s able to cut its losses during big drawdowns like the COVID crash to maintain capital and leave the other strategies in the dust with a sideways moving stock.

Adding this safety feature not only boosted total returns, but reduced the annual volatility of the strategy, increased the Sharpe ratio, and made drawdowns less severe (exactly as we had hoped).

df_safe_stats.columns = ['Mean Reversion with Safety', 'Buy and Hold']
df_stats.columns = ['Mean Reversion', 'x']
df_stats = pd.concat([df_stats.T, df_safe_stats.T])
df_stats.drop('x', axis=0, inplace=True)
df_stats


Looking at 2020, you can see how this safety feature winds up avoiding the worst of the crash.

import calendar
ticks = [pd.to_datetime(f'2020-{i}-01') for i in np.arange(1, 13)]
cr_mr_safe = np.exp(data_safe.loc[data_safe.index>=ticks[0]]['strat_log_returns'].cumsum())
cr_mr = np.exp(data.loc[data.index>=ticks[0]]['strat_log_returns'].cumsum())
cr_base = np.exp(data.loc[data.index>=ticks[0]]['log_returns'].cumsum())
fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(cr_mr_safe, label='Mean Reversion Strategy with Safety')
ax.plot(cr_mr, label='Mean Reversion Strategy')
ax.plot(cr_base, label=f'{ticker}')
ax.set_xlabel('Date')
ax.set_ylabel('Returns (%)')
ax.set_title('Cumulative Returns for Mean Reversion and Buy and Hold Strategies')
ax.set_xlim([pd.to_datetime('2020-01-01'), data.index[-1]])
ax.set_xticks(ticks)
ax.set_xticklabels([i for i in calendar.month_abbr if i != ''])
ax.legend()
plt.show()


Both mean reverting models go into the year neutral; however, the standard mean reversion model takes a long position when the crash begins and stays long until the bottom, then gets out for the bump that came in June. It catches up thanks to some later short positions to end the year above the long-only strategy, but down for the year. The mean reversion with safety model doesn’t do much differently apart from missing that big drawdown. This capital preservation move gives it more to work with while it waits for the bottom before opening a position. This, of course, compounds throughout the year, making it a much more effective strategy for this stock.

# Bottom Line

Mean reversion strategies try to take into account the natural tendency of stocks to go back to a long run average. This is a great way to extract gains from choppy and sideways-moving markets.

The example above is a good start and gives you a feel for some of the important parameters with this model, but it is limited. It doesn’t take into account dividends, transaction fees, or run a proper, walk-forward optimization to choose parameters.
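As one example of a missing piece, a crude per-trade cost could be charged against the strategy’s log returns whenever the position changes. This is only a sketch under an assumed flat 0.1% cost rate, not part of the model above:

```python
import numpy as np
import pandas as pd

def apply_transaction_costs(position: pd.Series,
                            strat_log_returns: pd.Series,
                            cost_rate: float = 0.001) -> pd.Series:
    # Count a trade each time the position changes (entry, exit, or flip)
    trades = position.diff().abs().fillna(0)
    # log(1 - cost) approximates the per-trade drag in log-return space
    return strat_log_returns + trades * np.log(1 - cost_rate)

# Toy example: enter long, hold, exit, then flip short
position = pd.Series([0, 1, 1, 0, -1])
rets = pd.Series([0.0, 0.01, -0.02, 0.005, 0.0])
net = apply_transaction_costs(position, rets)
```

Even this simple adjustment can meaningfully change the ranking of high-turnover strategies versus buy and hold.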

To get all of this with a hardened event-based backtest system, come check us out at Raposa. We’re building trading platforms for individuals to give them best-in-class experience and data. Sign up below to learn more.

Have you ever noticed the tendency for stocks that have risen to continue rising in the future? Likewise, a few down days seem to beget more losses. This is known as momentum and strategies that rely on these patterns are momentum-based strategies.

## TL;DR

We develop a basic momentum strategy and test it on GameStop to yield strong returns.

Traders and investors have long known about the effects of momentum and have found that these effects appear across a wide variety of markets and time frames. Running these strategies on a single instrument is also known as trend following or time series momentum. We’ll stick with the latter name, and abbreviate it TSM from here on out.

While there are a whole host of ways to run this strategy by combining it with a series of indicators, risk management overlays (a must for a real trading system!), and diversification, we’ll start simple by showing how it works on a single instrument.

To get started, we can turn to Python and some standard libraries.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf


From here, we can build our basic strategy function that we’ll call TSMStrategy. This will take the log returns of our time series, the period we’re interested in, and a boolean variable that determines whether we will allow short positions or not. The function will then simulate the strategy for us and return the cumulative sum of the performance for further analysis.

def TSMStrategy(returns, period=1, shorts=False):
    if shorts:
        position = returns.rolling(period).mean().map(
            lambda x: -1 if x <= 0 else 1)
    else:
        position = returns.rolling(period).mean().map(
            lambda x: 0 if x <= 0 else 1)
    performance = position.shift(1) * returns
    return performance


It’s a simple function for a simple (but powerful) strategy. The next step requires us to provide it with some data and see how it performs. I’m going to choose a stock that has been in the news a lot lately, GameStop (GME), to see how our model performs.

ticker = 'GME'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start='2000-01-01', end='2020-12-31')


Using the yfinance package, we can get the total public history of GME. It began trading in 2002, but setting the start date to 2000 will allow us to pick up the stock from the beginning without any errors.

To pass this to our strategy, we need to calculate the log returns and provide that to our function.

returns = np.log(data['Close'] / data['Close'].shift(1)).dropna()


The simplest TSM we can implement would require us to purchase the stock if it was up yesterday, and sell if it was down (if we’re holding it, otherwise we just wait). Let’s give it a shot.

performance = TSMStrategy(returns, period=1, shorts=False).dropna()
years = (performance.index.max() - performance.index.min()).days / 365
perf_cum = np.exp(performance.cumsum())
tot = perf_cum.iloc[-1] - 1
ann = perf_cum.iloc[-1] ** (1 / years) - 1
vol = performance.std() * np.sqrt(252)
rfr = 0.02
sharpe = (ann - rfr) / vol
print(f"1-day TSM Strategy yields:" +
      f"\n\t{tot*100:.2f}% total returns" +
      f"\n\t{ann*100:.2f}% annual returns" +
      f"\n\t{sharpe:.2f} Sharpe Ratio")
gme_ret = np.exp(returns.cumsum())
b_tot = gme_ret.iloc[-1] - 1
b_ann = gme_ret.iloc[-1] ** (1 / years) - 1
b_vol = returns.std() * np.sqrt(252)
b_sharpe = (b_ann - rfr) / b_vol
print(f"Baseline Buy and Hold Strategy yields:" +
      f"\n\t{b_tot*100:.2f}% total returns" +
      f"\n\t{b_ann*100:.2f}% annual returns" +
      f"\n\t{b_sharpe:.2f} Sharpe Ratio")

1-day TSM Strategy yields:
	225.03% total returns
	6.44% annual returns
	0.12 Sharpe Ratio
Baseline Buy and Hold Strategy yields:
	184.63% total returns
	5.70% annual returns
	0.07 Sharpe Ratio


The 1-Day TSM strategy beats the buy-and-hold with a reasonable annual return (ignoring transaction costs, which may be high given such a short term strategy). The 1-day lookback is likely fraught with a lot of false trends, so we can run a variety of different time periods to see how they stack up. We’ll just run our model in a loop with 3, 5, 15, 30, and 90-day time periods.

import matplotlib.gridspec as gridspec

periods = [3, 5, 15, 30, 90]
fig = plt.figure(figsize=(12, 10))
gs = gridspec.GridSpec(2, 2, figure=fig)
ax0 = fig.add_subplot(gs[0, :])
ax1 = fig.add_subplot(gs[1, 0])
ax2 = fig.add_subplot(gs[1, 1])
ax0.plot((np.exp(returns.cumsum()) - 1) * 100, label=ticker, linestyle='-')
perf_dict = {'tot_ret': {'buy_and_hold': (np.exp(returns.sum()) - 1)}}
perf_dict['ann_ret'] = {'buy_and_hold': b_ann}
perf_dict['sharpe'] = {'buy_and_hold': b_sharpe}
for p in periods:
    log_perf = TSMStrategy(returns, period=p, shorts=False)
    perf = np.exp(log_perf.cumsum())
    perf_dict['tot_ret'][p] = (perf.iloc[-1] - 1)
    ann = (perf.iloc[-1] ** (1/years) - 1)
    perf_dict['ann_ret'][p] = ann
    vol = log_perf.std() * np.sqrt(252)
    perf_dict['sharpe'][p] = (ann - rfr) / vol
    ax0.plot((perf - 1) * 100, label=f'{p}-Day Mean')

ax0.set_ylabel('Returns (%)')
ax0.set_xlabel('Date')
ax0.set_title('Cumulative Returns')
ax0.grid()
ax0.legend()
_ = [ax1.bar(i, v * 100) for i, v in enumerate(perf_dict['ann_ret'].values())]
ax1.set_xticks([i for i, k in enumerate(perf_dict['ann_ret'])])
ax1.set_xticklabels([f'{k}-Day Mean'
                     if type(k) is int else ticker for
                     k in perf_dict['ann_ret'].keys()],
                    rotation=45)
ax1.grid()
ax1.set_ylabel('Returns (%)')
ax1.set_xlabel('Strategy')
ax1.set_title('Annual Returns')
_ = [ax2.bar(i, v) for i, v in enumerate(perf_dict['sharpe'].values())]
ax2.set_xticks([i for i, k in enumerate(perf_dict['sharpe'])])
ax2.set_xticklabels([f'{k}-Day Mean'
                     if type(k) is int else ticker for
                     k in perf_dict['sharpe'].keys()],
                    rotation=45)
ax2.grid()
ax2.set_ylabel('Sharpe Ratio')
ax2.set_xlabel('Strategy')
ax2.set_title('Sharpe Ratio')
plt.tight_layout()
plt.show()


Looking at the above, the 15-day momentum indicator gives us the best absolute and risk-adjusted returns. However, there is a lot of spread in these results, which indicates that we don’t have a very robust strategy (no surprise). We could build on this basic strategy by incorporating other indicators, such as moving averages or exponentially weighted moving averages, to indicate momentum. We could manage our risk better by incorporating stop losses or trailing stops to get out of trades closer to the top, rather than waiting for a down or flat 15 days. We could also allocate capital across multiple securities, thereby benefiting from diversification and jumping on trends wherever they appear.
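As one illustration of the risk-management point, here is a minimal sketch of a trailing stop layered on top of the TSM signal. This is not part of the backtests above: the function name tsm_with_trailing_stop and the 10% default stop level are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

def tsm_with_trailing_stop(returns, period=15, stop_pct=0.10):
    """Hypothetical trailing-stop overlay on the long-only TSM signal.

    Goes long when the rolling mean of returns is positive, but steps
    aside once the position's cumulative return falls stop_pct below
    its peak, staying out until the signal goes flat and resets.
    """
    signal = returns.rolling(period).mean().gt(0).astype(int)
    position = signal.shift(1).fillna(0)
    out = []
    equity, peak, stopped = 1.0, 1.0, False
    for pos, r in zip(position, returns):
        if pos == 0:
            # Flat: reset the stop and the equity tracker
            equity, peak, stopped = 1.0, 1.0, False
            out.append(0.0)
            continue
        if stopped:
            out.append(0.0)
            continue
        equity *= np.exp(r)
        peak = max(peak, equity)
        if equity / peak - 1 < -stop_pct:
            stopped = True  # exit at this bar's close
        out.append(r)
    return pd.Series(out, index=returns.index)
```

Because it exits at the close of the bar that breaches the stop, that bar’s loss is still taken; a production version would work from intraday prices and entry levels rather than daily log returns.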

Some of these improvements require a more sophisticated, event-driven backtest system to properly simulate. This is something we’ve been working on for some time at Raposa Technologies and we’re making available to the public. To stay up to date with the latest developments and features as we roll out a professional quality, no-code backtesting system, subscribe below and be among the first to get access and updates to our newest offerings.

## Vectorized Simple Moving Average Strategy to beat the S&P 500

Anyone who follows the financial media has likely seen headlines like “The S&P 500 just formed a ‘Golden Cross,’ a bullish chart pattern with a solid track record” or “A bearish ‘death cross’ has appeared in the Dow’s chart.” These headlines refer to a simple moving average (SMA) indicator we can use to trade.

# TL;DR

We construct a simple moving average strategy in Python and backtest the results.

# Cross of Gold

Simple Moving Average (SMA) strategies are the bread and butter of algorithmic trading. At their most basic level, traders look at a short term moving price average and a longer term average (say, the 50-day and 200-day moving averages) and buy when the short term value is greater than the long term value. They’ll often sell or go short when that trend reverses. Amazingly, something as simple as this is able to garner a profitable return!

In its most basic application, a trader will select a single contract or security to track over time. They then calculate their averages over a given number of time periods — these can be days, weeks, months, minutes, seconds, whatever interval suits your trading speed and style — and make the comparison. We’re going to implement this on daily price charts and show how to use it to build a basic trading strategy in Python.

Let’s start by importing a few packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf


Above, we have some Python basics like numpy, pandas, and matplotlib. Additionally, we import yfinance, which is a package that gives us access to Yahoo! Finance data. The Yahoo! Finance API has changed a lot over the years, so a number of different packages have been deprecated and new ones have emerged to offer consistent access to this free data source. We’ll use this to pull our data and set up our strategy.

To do so, select a ticker symbol and pass it to the Ticker method. I’m going to start at the beginning of the alphabet with ‘A’, which is Agilent Technologies, a healthcare company with a long history for backtesting.

ticker = "A"
yfObj = yf.Ticker(ticker)
data = yfObj.history(start="2000-01-01", end="2020-12-31")
name = yfObj.info['shortName']
plt.figure(figsize=(10, 5))
plt.plot(data['Close'])
plt.title(f'Price Chart for {name}')
plt.grid()
plt.show()


Looking at this price chart, we can see that Agilent had a huge run-up in price during the Dot Com Bubble, then crashed hard and only reached its 2000–2001 peak 20 years later. A long-term, buy-and-hold investor who got in before the peak in early 2000 still would have waited over 15 years to recoup their investment (Agilent paid a regular, quarterly dividend starting in 2012, which would have changed this by a few quarters).

Let’s see if we can do better with a SMA strategy using standard, 50 and 200-day moving averages.

def SMABacktest(ticker, short_term_sma, long_term_sma,
                shorts=False, start_date='2000-01-01', end_date='2020-12-31'):
    yfObj = yf.Ticker(ticker)
    data = yfObj.history(start=start_date, end=end_date)

    data['SMA1'] = data['Close'].rolling(short_term_sma).mean()
    data['SMA2'] = data['Close'].rolling(long_term_sma).mean()
    if shorts:
        data['position'] = np.where(
            data['SMA1'] > data['SMA2'], 1, -1)
    else:
        data['position'] = np.where(
            data['SMA1'] > data['SMA2'], 1, 0)

    # Calculate returns
    data['returns'] = data['Close'] / data['Close'].shift(1)
    data['log_returns'] = np.log(data['returns'])
    data['strat_returns'] = data['position'].shift(1) * \
        data['returns']
    data['strat_log_returns'] = data['position'].shift(1) * \
        data['log_returns']
    data['cum_returns'] = np.exp(data['log_returns'].cumsum())
    data['strat_cum_returns'] = np.exp(
        data['strat_log_returns'].cumsum())
    data['peak'] = data['cum_returns'].cummax()
    data['strat_peak'] = data['strat_cum_returns'].cummax()

    return data


In the SMABacktest function above, we supply the stock ticker, get the data from Yahoo! Finance and then calculate the moving averages for each of our SMA indicators. We included a shorts option that allows the algorithm to go short if the SMA reverses, otherwise, it just exits the position and waits for the faster SMA to poke above the slower SMA before going long again.

We can plot the strategy as well as the cumulative returns below.

short_term_sma = 50
long_term_sma = 200
data = SMABacktest(ticker, short_term_sma, long_term_sma)
fig, ax = plt.subplots(2, figsize=(10, 5), sharex=True)
ax[0].plot(data['Close'], label=ticker)
ax[0].plot(data['SMA1'], label=f"{short_term_sma}-Day SMA")
ax[0].plot(data['SMA2'], label=f"{long_term_sma}-Day SMA")
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'{ticker} Price with {short_term_sma}-Day SMA and {long_term_sma}-Day SMA')
ax[0].legend(bbox_to_anchor=[1, 0.75])
ax[0].grid()
ax[1].plot((data['strat_cum_returns'] - 1) * 100, label='SMA Strategy')
ax[1].plot((data['cum_returns'] - 1) * 100, label='Buy and Hold Strategy')
ax[1].set_ylabel('Returns (%)')
ax[1].set_xlabel('Date')
ax[1].set_title(f'Cumulative Returns for SMA and Buy and Hold Strategy')
ax[1].legend(bbox_to_anchor=[1.25, 0.75])
ax[1].grid()
plt.show()


Let’s also put another function together so that we can evaluate the performance of our model using some standard metrics.

def getStratStats(data, risk_free_rate=0.02):
    sma_strat, buy_hold_strat = {}, {}

    # Total Returns
    sma_strat['tot_returns'] = np.exp(data['strat_log_returns'].sum()) - 1
    buy_hold_strat['tot_returns'] = np.exp(data['log_returns'].sum()) - 1

    # Mean Annual Returns
    sma_strat['annual_returns'] = np.exp(data['strat_log_returns'].mean() * 252) - 1
    buy_hold_strat['annual_returns'] = np.exp(data['log_returns'].mean() * 252) - 1

    # Annual Volatility
    sma_strat['annual_volatility'] = data['strat_log_returns'].std() * np.sqrt(252)
    buy_hold_strat['annual_volatility'] = data['log_returns'].std() * np.sqrt(252)

    # Sharpe Ratio
    sma_strat['sharpe_ratio'] = (sma_strat['annual_returns'] - risk_free_rate) \
        / sma_strat['annual_volatility']
    buy_hold_strat['sharpe_ratio'] = (buy_hold_strat['annual_returns'] - risk_free_rate) \
        / buy_hold_strat['annual_volatility']

    # Max Drawdown
    _strat_dd = data['strat_peak'] - data['strat_cum_returns']
    _buy_hold_dd = data['peak'] - data['cum_returns']
    sma_strat['max_drawdown'] = _strat_dd.max()
    buy_hold_strat['max_drawdown'] = _buy_hold_dd.max()

    # Max Drawdown Duration
    strat_dd = _strat_dd[_strat_dd==0]
    strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
    strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
    strat_dd_days = np.hstack([strat_dd_days,
        (_strat_dd.index[-1] - strat_dd.index[-1]).days])
    buy_hold_dd = _buy_hold_dd[_buy_hold_dd==0]
    buy_hold_dd_diff = buy_hold_dd.index[1:] - buy_hold_dd.index[:-1]
    buy_hold_dd_days = buy_hold_dd_diff.map(lambda x: x.days).values
    buy_hold_dd_days = np.hstack([buy_hold_dd_days,
        (_buy_hold_dd.index[-1] - buy_hold_dd.index[-1]).days])
    sma_strat['max_drawdown_duration'] = strat_dd_days.max()
    buy_hold_strat['max_drawdown_duration'] = buy_hold_dd_days.max()

    stats_dict = {'strat_stats': sma_strat,
                  'base_strat': buy_hold_strat}
    return stats_dict


The getStratStats function above returns a number of important statistics to evaluate the strategy’s performance. It also computes the same statistics for the buy-and-hold strategy as a baseline, showing how some simple trading rules can yield additional returns.

The stats we show above are fairly standard. We work off of the log returns and convert them to simple returns for easier interpretation. We also show drawdown stats, which are important when selecting a strategy that suits your personal temperament. This metric is a bit more involved to calculate than the others. It requires us to get the cumulative returns and compare them with the peak returns up to that point in time. A new peak will equal the cumulative returns up to that point, so we can pull out new all-time highs by finding where the difference between the peak and cumulative returns is 0. To get the duration, we then look at the datetime indices of the new peaks and extract the days between them. It’s also important to get the time since the last peak in case the strategy is currently in a long drawdown.
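To make that concrete, here is a small standalone example of the same peak/drawdown-duration calculation on a six-day toy series with made-up numbers:

```python
import numpy as np
import pandas as pd

# Toy cumulative-return series: new all-time highs on days 0, 1, and 5,
# with a four-day drawdown in between.
idx = pd.date_range('2021-01-01', periods=6, freq='D')
cum_returns = pd.Series([1.0, 1.1, 0.9, 0.95, 1.05, 1.2], index=idx)

peak = cum_returns.cummax()          # running all-time high
drawdown = peak - cum_returns        # 0 wherever a new high is set

new_highs = drawdown[drawdown == 0]  # days 0, 1, and 5
gaps = new_highs.index[1:] - new_highs.index[:-1]
dd_days = gaps.map(lambda x: x.days).values

# Append the time since the last peak in case we're still in a drawdown
dd_days = np.hstack([dd_days,
                     (drawdown.index[-1] - new_highs.index[-1]).days])
print(dd_days.max())  # longest stretch between new highs: 4 days
```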

Let’s see how this strategy performed.

stats_dict = getStratStats(data, risk_free_rate=0.02)
pd.DataFrame(stats_dict).round(3)


We see that the annualized returns are a healthy 6.7% with the SMA strategy versus 4.7% with buy and hold (again, ignoring dividends). The volatility of the SMA strategy is significantly lower than buy and hold, which can be seen in the plot above where the SMA strategy exits all positions and flatlines while waiting for a new golden cross to appear. Both strategies, however, suffer from long drawdowns spanning multiple years. While the results look solid, would you be able to stay disciplined and stick with the strategy while waiting over 2,400 days to reach a new high?

# What about an Index Fund?

Warren Buffett frequently advises the average investor to just buy an index fund and forget about it. How does this strategy compare with buying the SPY in 2000? Well, that’s easy enough to check!

spyObj = yf.Ticker('SPY')
spy_data = spyObj.history(start='2000-01-01', end='2020-12-31')
spy_ratio = spy_data['Close'].iloc[-1] / spy_data['Close'].iloc[0]
spy_ret = spy_ratio - 1
years = (spy_data.index[-1] - spy_data.index[0]).days / 365
spy_ann_ret = spy_ratio ** (1 / years) - 1
print(f'SPY Total Returns:\t{spy_ret*100:.2f}%' +
      f'\nSPY Annual Returns:\t{spy_ann_ret*100:.2f}%')

print(f'SMA Total Returns:\t{stats_dict["strat_stats"]["tot_returns"]*100:.2f}%' +
      f'\nSMA Annual Returns:\t{stats_dict["strat_stats"]["annual_returns"]*100:.2f}%')
SPY Total Returns:	279.02%
SPY Annual Returns:	6.55%
SMA Total Returns:	290.69%
SMA Annual Returns:	6.72%


Our golden cross strategy on a stock with a decade-long bear market outperforms buying and holding the S&P 500. The big reason: our model avoids the largest losses by sitting on the sidelines and waiting for a signal.

We can run this strategy on the SPY itself to see if we can outperform the tried-and-true, long-only index fund strategy. Additionally, we can add shorts to our strategy to try to further boost our returns.

sma_spy_data = SMABacktest('spy', short_term_sma,
                           long_term_sma, shorts=False)
sma_spy_data_shorts = SMABacktest('spy', short_term_sma,
                                  long_term_sma, shorts=True)
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot((sma_spy_data['strat_cum_returns'] - 1) * 100,
        label='SMA SPY')
ax.plot((sma_spy_data_shorts['strat_cum_returns'] - 1) * 100,
        label='SMA SPY with Shorts')
ax.plot((data['strat_cum_returns'] - 1) * 100,
        label='SMA A')
ax.plot((sma_spy_data['cum_returns'] - 1) * 100,
        label='SPY Buy and Hold')
ax.set_ylabel('Returns (%)')
ax.set_xlabel('Date')
ax.set_title('Cumulative Returns for SMA and Buy and Hold Strategies')
ax.legend(bbox_to_anchor=[1, 0.5])
ax.grid()
plt.show()

spy_stats = getStratStats(sma_spy_data, risk_free_rate=0.02)
spy_stats_shorts = getStratStats(sma_spy_data_shorts, risk_free_rate=0.02)
df0 = pd.DataFrame(stats_dict)
df0.columns = ['A SMA', 'A Buy and Hold']
df1 = pd.DataFrame(spy_stats)
df1.columns = ['SPY SMA', 'base_strat']
df2 = pd.DataFrame(spy_stats_shorts)
df2.columns = ['SPY SMA with Shorts', 'SPY Buy and Hold']
df = pd.concat([df0, df1, df2], axis=1)
df.drop('base_strat', axis=1, inplace=True)
df.round(3)


Looking at the values above, we can see that overlaying a SMA strategy helps boost both absolute returns and our risk adjusted returns above and beyond what we could get from most baselines.

There are a lot of ways we could develop this further. First, we just chose 50- and 200-day periods for our moving averages. Those likely aren’t the best, leaving performance on the table from optimizing these parameters. Second, there’s no risk control in this strategy. This could be introduced by adding other markets or instruments, or by using volatility overlays to adjust position size, reducing the pain of losses and winding up with higher returns. There are no taxes or transaction costs taken into account here either. We also didn’t take dividends into account because we ran this on standard, daily OHLC data rather than total return data or explicitly adjusting for dividend payments.
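As a sketch of the first point, a naive in-sample grid search over SMA pairs might look like the following. It uses a synthetic random-walk price series rather than real data, the candidate pairs are arbitrary, and without a walk-forward split the “best” pair is almost certainly overfit:

```python
import numpy as np
import pandas as pd

def sma_crossover_log_returns(prices, fast, slow):
    """Log returns of a long-only SMA crossover on a price series."""
    sma1 = prices.rolling(fast).mean()
    sma2 = prices.rolling(slow).mean()
    position = pd.Series(np.where(sma1 > sma2, 1, 0), index=prices.index)
    log_returns = np.log(prices / prices.shift(1))
    return position.shift(1) * log_returns

# Synthetic random-walk prices stand in for real data here
rng = np.random.default_rng(42)
idx = pd.date_range('2000-01-01', periods=1000, freq='B')
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0002, 0.01, 1000))),
                   index=idx)

# Naive in-sample search over a few candidate (fast, slow) pairs
results = {}
for fast, slow in [(20, 100), (50, 200), (100, 300)]:
    strat = sma_crossover_log_returns(prices, fast, slow).dropna()
    results[(fast, slow)] = np.exp(strat.sum()) - 1  # total return

best = max(results, key=results.get)
print(f"Best in-sample pair: {best}, total return: {results[best]:.2%}")
```

A more trustworthy version would rank pairs on a training window and evaluate only the winner on held-out data, which is exactly the walk-forward optimization mentioned above.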