## How To Reduce Lag In A Moving Average

Moving average indicators are commonly used to give traders a general idea about the direction of the trend by smoothing the price series. One of the big drawbacks of most common moving averages is the lag with which they operate. A strong trend up or down may take a long time to be confirmed by the average, and profit is lost in the meantime.

In 2005, Alan Hull devised the Hull Moving Average (HMA) to address this problem.

The calculation is relatively straightforward and can be done in 4 steps after choosing the number of periods, N, to use in the calculation:

1. Calculate the simple moving average over the past N periods.
• SMA1 = SMA(price, N)
2. Calculate the simple moving average over the past N/2 periods, rounded to the nearest whole value.
• SMA2 = SMA(price, int(N/2))
3. Multiply the shorter moving average by 2 and then subtract the first moving average from this.
• SMA_diff = 2 * SMA2 - SMA1
4. Take the moving average of this value over a period length equal to the square root of N, rounded to the nearest whole number.
• HMA = SMA(SMA_diff, int(sqrt(N)))

This winds up being more responsive to recent changes in price because the shorter average, which covers only the most recent half of our data, is multiplied by 2 before the longer average is subtracted. This puts additional weight on the most recent values before we smooth things out again with the final moving average calculation. Note that Hull’s original formulation uses linearly weighted moving averages (WMA) at each step, which is why many blogs list each of these as weighted moving averages, often without specifying the weights themselves. In this post we keep things simple and substitute simple moving averages, which are scaled before being combined at the end.

For completeness, we can also write this out mathematically.

If we are calculating the SMA at time t over the last N periods, we’re going to call this SMA^N_t. For moving averages, we’re just summing the last N prices (we’ll use P for prices) and dividing by N like so:

SMA_t^N = \frac{1}{N}\sum_{i=0}^{N-1} P_{t-i}

SMA_t^M = \frac{1}{M}\sum_{i=0}^{M-1} P_{t-i}

HMA_t^H = \frac{1}{H} \sum_{i=0}^{H-1} (2SMA^M_{t-i} - SMA^N_{t-i})

where the symbols M and H are N/2 and the square root of N, each rounded to the nearest integer value.

M = \bigg\lfloor \frac{N}{2} \bigg\rceil

H = \bigg\lfloor \sqrt{N} \bigg\rceil
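
To see why this construction cuts lag, it can help to run the simple-moving-average version above on an idealized straight-line trend. The following is a sketch under stated assumptions: a perfectly linear price P_t = a + bt, an even N so that M = N/2 exactly, and windows that end at time t. The symbols a and b are introduced here purely for illustration.

```latex
\begin{aligned}
SMA^N_t &= \frac{1}{N}\sum_{i=0}^{N-1}\big(a + b(t-i)\big) = P_t - \frac{b(N-1)}{2} \\
2SMA^M_t - SMA^N_t &= P_t - b(M-1) + \frac{b(N-1)}{2} = P_t + \frac{b}{2} \qquad (M = N/2) \\
HMA^H_t &= \frac{1}{H}\sum_{i=0}^{H-1}\Big(P_{t-i} + \frac{b}{2}\Big) = P_t - \frac{b(H-2)}{2}
\end{aligned}
```

For N = 50 (so M = 25 and H = 7), the plain SMA trails the line by 24.5b while this HMA trails it by only 2.5b. Real prices are not straight lines, so treat this as intuition rather than a guarantee.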

Hopefully, that’s all pretty straightforward. Let’s get to some examples in Python to illustrate how this works.

## Hull Moving Average in Python

Like usual, let’s grab a few packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf


From here, we can write a function to calculate the HMA in just three lines of code, corresponding to the three equations we showed above.

def calcHullMA(price: pd.Series, N=50):
    SMA1 = price.rolling(N).mean()
    SMA2 = price.rolling(int(N/2)).mean()
    return (2 * SMA2 - SMA1).rolling(int(np.sqrt(N))).mean()


We have our two moving averages, take the difference, and then smooth out the results with a third moving average. This function assumes we’re working with a Pandas data series and takes advantage of many of the methods that enables. Just be careful not to pass it a list or a NumPy array!
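
If you do need to feed in a list or a NumPy array, one option is to coerce the input to a Series first. The sketch below assumes that approach; `calc_hull_ma_any` is a hypothetical name used only for this illustration, and it repeats the same three steps so that it runs on its own.

```python
import numpy as np
import pandas as pd

def calc_hull_ma_any(price, N=50):
    """Hypothetical wrapper: accept a list, array, or Series of prices."""
    # Coerce to a float Series so that .rolling() is available
    price = pd.Series(price, dtype=float)
    sma_long = price.rolling(N).mean()
    sma_short = price.rolling(int(N / 2)).mean()
    return (2 * sma_short - sma_long).rolling(int(np.sqrt(N))).mean()

# A plain Python list on a straight-line trend
hma = calc_hull_ma_any([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], N=4)
print(hma.tolist())
```

With N=4 the first four values are NaN while the windows fill, and after that the output sits on top of the straight-line input.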

## Getting Some Data

Let’s illustrate how this works on some historical data. I’m just getting a year’s worth from a common stock, DAL.

ticker = 'DAL'
start = '2014-01-01'
end = '2015-01-01'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start=start, end=end)
data.drop(['Open', 'High', 'Low', 'Volume', 'Dividends', 'Stock Splits'],
axis=1, inplace=True)

# Applying our function
N = 50
data[f'HMA_{N}'] = calcHullMA(data['Close'], N)


Take a look to see how it behaves:

plt.figure(figsize=(12, 8))
plt.plot(data['Close'], label='Close')
plt.plot(data[f'HMA_{N}'], label='HMA')
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'HMA and Price for {ticker}')
plt.legend()
plt.show()

As you can see, the HMA follows pretty closely. Of course, there is a lag as can be seen with some of the larger peaks and valleys over this time frame. Does it smooth well and with a lower lag than other moving averages as Hull intends?

To find out, let’s compare it to a typical, simple moving average and an exponential moving average (EMA). Like the HMA, the EMA is designed to be more responsive to recent price changes.

The code for the EMA calculation below was taken from a previous post you can dive into for further details.

def _calcEMA(P, last_ema, N):
    return (P - last_ema) * (2 / (N + 1)) + last_ema

def calcEMA(data: pd.DataFrame, N: int, key: str = 'Close'):
    # Initialize series
    sma = data[key].rolling(N).mean()
    ema = np.zeros(len(data)) + np.nan
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if np.isnan(ema[i-1]):
            # Seed the EMA with the SMA (positional lookup)
            ema[i] = sma.iloc[i]
        else:
            ema[i] = _calcEMA(row[key], ema[i-1], N)
    return ema

Plotting the results:

data[f'EMA_{N}'] = calcEMA(data, N)
data[f'SMA_{N}'] = data['Close'].rolling(N).mean()

plt.figure(figsize=(12, 8))
plt.plot(data['Close'], label='Close', linewidth=0.5)
plt.plot(data[f'HMA_{N}'], label='HMA')
plt.plot(data[f'EMA_{N}'], label='EMA')
plt.plot(data[f'SMA_{N}'], label='SMA')
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title('Comparing 50-Day Moving Averages to Price')
plt.legend()
plt.show()


The plot looks pretty good. The HMA seems to track the price more closely than the other indicators while providing some good smoothing. However, we aren’t technical traders here at Raposa, so we need to do more than just look at a chart. We want to see the data!

To get an idea for the tracking error, we’re going to use the root mean square error (RMSE) to measure the difference between the indicator value and the price.

The RMSE is a common error metric that punishes large deviations by squaring the error term. This means an error of 2 is penalized 4 times as heavily as an error of 1! These squared errors are summed and divided by the number of observations, n, and then we take the square root of the result.

RMSE = \sqrt{\frac{\sum_t \big(\hat{P}_t - P_t \big)^2}{n}}
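
As a quick sanity check on the formula, here is a small worked example on made-up numbers (the values are illustrative only, not market data). The leading NaN mimics the warm-up period of a moving average, which is excluded from n.

```python
import numpy as np
import pandas as pd

price = pd.Series([10.0, 11.0, 12.0, 13.0])
indicator = pd.Series([np.nan, 10.0, 11.0, 11.0])  # NaN mimics the warm-up window

errors = (indicator - price).dropna()   # -1, -1, -2
rmse = np.sqrt((errors ** 2).mean())    # sqrt((1 + 1 + 4) / 3)
print(round(rmse, 4))  # 1.4142
```

The single error of 2 contributes 4 of the 6 units of squared error, which is the extra penalty on larger misses described above.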

We’ll run our errors through a quick RMSE function we’ll write and see the results.

# Calculate tracking error
def calcRMSE(price, indicator):
    sq_error = np.power(indicator - price, 2).sum()
    n = len(indicator.dropna())
    return np.sqrt(sq_error / n)

hma_error = calcRMSE(data['Close'], data[f'HMA_{N}'])
ema_error = calcRMSE(data['Close'], data[f'EMA_{N}'])
sma_error = calcRMSE(data['Close'], data[f'SMA_{N}'])

print('Lag Error')
print(f'\tHMA = \t{hma_error:.2f}')
print(f'\tEMA = \t{ema_error:.2f}')
print(f'\tSMA = \t{sma_error:.2f}')

Lag Error
HMA = 	1.65
EMA = 	1.24
SMA = 	1.53


Whoa! The HMA actually has greater error vs the price it’s tracking than the EMA and the SMA. This seems to cut against the intent of the HMA.

This is a small sample size, however, so maybe it really does have less lag than the other indicators and we just chose a bad stock and/or time frame.

Let’s test this by calculating the RMSE for all of the stocks in the S&P 500 over the course of a year. Additionally, we’ll do this for different values of N to see if there’s any relationship between shorter or longer term values and the error.

Below, we have a helper function to calculate these values for us.

def calcErrors(data: pd.DataFrame, N: list):
    hma_error, sma_error, ema_error = [], [], []
    for n in N:
        hma = calcHullMA(data['Close'], n)
        ema = pd.Series(calcEMA(data, n), index=data.index)
        sma = data['Close'].rolling(n).mean()
        hma_error.append(calcRMSE(data['Close'], hma))
        ema_error.append(calcRMSE(data['Close'], ema))
        sma_error.append(calcRMSE(data['Close'], sma))

    return hma_error, ema_error, sma_error


The calcErrors function takes our data and a list of time periods to calculate the HMA, EMA, and SMA. From there, we calculate the RMSE for each series versus our closing price and return lists of each.

Next, we’ll loop over all the stocks in the S&P 500 and get the data for each. We’ll pass this to our error calculation function and collect the errors for each symbol.

We’re relying on the list of stocks in Wikipedia, which doesn’t necessarily correspond to how the symbols are represented in yfinance (e.g. Berkshire Hathaway has two classes of shares A’s and B’s, which cause issues) so we need to wrap this in a try-except statement for those edge cases. We’ll still get enough that we should be able to get a decent estimate.

# Loop over every ticker in the S&P 500
url = 'https://en.wikipedia.org/wiki/List_of_S%26P_500_companies'
table = pd.read_html(url)
df = table[0]
syms = df['Symbol']
start = '2019-01-01'
end = '2020-01-01'
N = [5, 10, 15, 20, 30, 50, 100, 150, 200]
hma_rows, ema_rows, sma_rows = [], [], []
for s in syms:
    try:
        yfObj = yf.Ticker(s)
        data = yfObj.history(start=start, end=end)
        he, ee, se = calcErrors(data, N)
    except Exception:
        # Skip symbols that yfinance can't resolve
        continue
    hma_rows.append(he)
    ema_rows.append(ee)
    sma_rows.append(se)

# Stack into arrays: one row per ticker, one column per N
hma_error = np.array(hma_rows)
ema_error = np.array(ema_rows)
sma_error = np.array(sma_rows)

# Drop rows with missing values
hma_error = hma_error[~np.isnan(hma_error).any(axis=1)]
ema_error = ema_error[~np.isnan(ema_error).any(axis=1)]
sma_error = sma_error[~np.isnan(sma_error).any(axis=1)]


After a few minutes, we can take a look at the mean tracking error across all of our metrics and tickers below:

Here we see that the HMA does track the price much better than other moving average measurements. There’s much less difference in short-time frames, but the values do start to diverge from one another fairly quickly and become more pronounced over time.
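
For readers who want to reproduce the comparison, one way to collapse the per-ticker error arrays into a mean per N is sketched below. `summarize_errors` is a hypothetical helper, and the numbers in `demo` are placeholders that only show the expected shape of the inputs; they are not results.

```python
import numpy as np
import pandas as pd

def summarize_errors(errors: dict, N: list) -> pd.DataFrame:
    """Average each (tickers x periods) error array over the tickers."""
    return pd.DataFrame(
        {name: np.asarray(arr).mean(axis=0) for name, arr in errors.items()},
        index=pd.Index(N, name='N'))

# Placeholder numbers purely to illustrate the input shape (2 tickers, 2 periods)
demo = {'HMA': [[1.0, 2.0], [3.0, 4.0]],
        'SMA': [[2.0, 4.0], [4.0, 6.0]]}
summary = summarize_errors(demo, N=[5, 10])
print(summary)
```

With the arrays built in the loop above, the call would look like `summarize_errors({'HMA': hma_error, 'EMA': ema_error, 'SMA': sma_error}, N)`, and calling `.plot()` on the result would draw one line per indicator.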

## Trading with the Hull Moving Average

We could be more rigorous by tracking the deviation of our error measurements and getting more data; however, for most purposes, it does seem as if the HMA delivers on its promise of reducing lag. How do you trade it though?

The nice thing about the HMA is that you can use it anywhere you’d use a moving average of any variety. You could build a whole new strategy around it, or just plug it into an existing system to see if you get any boost in your results.

We make all of that as easy as possible at Raposa, where we’re building a platform to allow you to backtest your ideas in seconds, without writing a single line of code. You can check out our free demo here!

## How to Trade the MACD: Four Strategies with Backtests

The Moving Average Convergence-Divergence (MACD) is a popular and versatile indicator that appears in a number of trading systems. In its most basic form, the MACD is the difference between two exponential moving averages (EMA), one fast and the other slow. From this simple beginning a host of other indicators such as signal lines and MACD bars are built. We’ll show you how to implement each of these, but refer back to this article for a thorough explanation.

In this post, we’re going to implement and backtest four versions of the MACD in Python to illustrate how to trade it and discuss areas for improvement.

Of course, if you want to skip all the code, you can try our new platform for free here to test these ideas.

# Mean-Reverting MACD

A good place to get started with the MACD is to see when the signal diverges, i.e. when the MACD moves far away from 0. A mean-reverting strategy can quickly be built on top of this.

Our strategy is going to use standard parameters for the MACD. Our fast EMA will look over the past 12 days, and the slow EMA will run over 26 days. The model is going to buy our stock when the MACD breaks below a certain threshold, and will sell when the MACD converges back to 0. If the MACD runs high, we’ll short the stock and cover when it gets back to 0. We’re simply trying to jump on large, outlier movements in the price with the hope that the price will move back towards the longer EMA.

To get going, fire up Python and import the following packages.

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt


To start, we need to calculate the MACD. As stated above, the MACD is built on top of the EMA, so we’re going to write a few functions to calculate the EMA, one to plug in the values, the other that will initialize it and apply it to our data frame. From there, we can write our MACD function that will take these parameters, calculate the EMAs, our MACD, and fill in our data frame.

The code for the following functions was taken from a previous post on the MACD, which goes through all of the details of the code and calculations.

def _calcEMA(P, last_ema, N):
    return (P - last_ema) * (2 / (N + 1)) + last_ema

def calcEMA(data: pd.DataFrame, N: int, key: str = 'Close'):
    # Initialize series
    data['SMA_' + str(N)] = data[key].rolling(N).mean()
    ema = np.zeros(len(data)) + np.nan
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if np.isnan(ema[i-1]):
            ema[i] = row['SMA_' + str(N)]
        else:
            ema[i] = _calcEMA(row[key], ema[i-1], N)

    data['EMA_' + str(N)] = ema.copy()
    return data

def calcMACD(data: pd.DataFrame, N_fast: int, N_slow: int):
    assert N_fast < N_slow, \
        "Fast EMA must be less than slow EMA parameter."
    data = calcEMA(data, N_fast)
    data = calcEMA(data, N_slow)
    # Subtract values to get MACD
    data['MACD'] = data[f'EMA_{N_fast}'] - data[f'EMA_{N_slow}']
    # Drop extra columns
    data.drop(['Open', 'High', 'Low', 'Volume',
               'Dividends', 'Stock Splits'], axis=1, inplace=True)
    return data
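
The row-by-row loop is easy to follow but slow on long histories. As a possible alternative, pandas can run the same EMA recursion with `ewm(span=N, adjust=False)`, whose smoothing factor is 2 / (N + 1). The sketch below is not the method used in this post: `macd_ewm` is a hypothetical name, and it seeds each EMA with the first price rather than with the N-period SMA, so early values will differ from `calcMACD`.

```python
import numpy as np
import pandas as pd

def macd_ewm(close: pd.Series, n_fast: int = 12, n_slow: int = 26) -> pd.Series:
    """Hypothetical vectorized MACD; each EMA is seeded with the first price."""
    assert n_fast < n_slow, "Fast EMA must be shorter than slow EMA."
    # adjust=False gives the recursion: ema_t = ema_{t-1} + alpha * (P_t - ema_{t-1})
    ema_fast = close.ewm(span=n_fast, adjust=False).mean()
    ema_slow = close.ewm(span=n_slow, adjust=False).mean()
    return ema_fast - ema_slow

# Made-up closing prices, for illustration only
close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0, 14.0])
macd = macd_ewm(close, n_fast=2, n_slow=4)
print(macd.round(4).tolist())
```

Because of the different seed, the two versions converge only after the initial values have decayed, so it is worth picking one and using it consistently within a backtest.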


Now with the MACD in place, we’re going to write two more functions. One is going to be our strategy function, and the other will calculate all of our returns. We’re breaking this second one into its own function because we can re-use the code later when we write other MACD strategies.

The MACDReversionStrategy is where our strategy is implemented (if you didn’t guess that by the name). This is a simple, vectorized implementation that is just going to look for our level value, and trade if the MACD moves outside of its bounds.

def calcReturns(df):
    df['returns'] = df['Close'] / df['Close'].shift(1)
    df['log_returns'] = np.log(df['returns'])
    df['strat_returns'] = df['position'].shift(1) * df['returns']
    df['strat_log_returns'] = df['position'].shift(1) * \
        df['log_returns']
    df['cum_returns'] = np.exp(df['log_returns'].cumsum()) - 1
    df['strat_cum_returns'] = np.exp(
        df['strat_log_returns'].cumsum()) - 1

    return df

def MACDReversionStrategy(data, N_fast=12, N_slow=26,
        level=1, shorts=True):
    # calcMACD already drops the extra price columns
    df = calcMACD(data, N_fast, N_slow)
    df['position'] = np.nan
    # Go long when the MACD falls below the lower threshold
    df['position'] = np.where(df['MACD']<-level, 1, df['position'])
    if shorts:
        # Go short when the MACD rises above the upper threshold
        df['position'] = np.where(df['MACD']>level, -1, df['position'])

    # Exit when the MACD changes sign, i.e. crosses 0
    df['position'] = np.where(df['MACD'].shift(1)/df['MACD']<0,
        0, df['position'])
    df['position'] = df['position'].ffill().fillna(0)

    return calcReturns(df)
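
To make the entry and exit rules concrete, here is a toy walk-through on a handful of made-up MACD values (not market data). It follows the rules described in the text, long below -level, short above +level, flat on a zero cross, using boolean masks instead of `np.where`.

```python
import numpy as np
import pandas as pd

macd = pd.Series([-0.2, -1.5, -0.8, 0.3, 1.4, 0.6, -0.2])
level = 1

position = pd.Series(np.nan, index=macd.index)
position[macd < -level] = 1       # stretched far below zero: go long
position[macd > level] = -1       # stretched far above zero: go short
crossed_zero = macd.shift(1) / macd < 0
position[crossed_zero] = 0        # MACD changed sign: flatten
position = position.ffill().fillna(0)
print(position.tolist())  # [0.0, 1.0, 1.0, 0.0, -1.0, -1.0, 0.0]
```

The forward fill is what holds a position between the entry and the zero cross; without it the model would only be invested on the day the threshold is breached.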


The next step is to get data. We’ll rely on the yfinance package to get some free data from Yahoo! Finance. For backtests, more data is better, so we’re going to pick an older company to see how this performs over a long, multi-decade time horizon. Let’s choose 3M (ticker: MMM) and see how this strategy works.

ticker = 'MMM'
start = '2000-01-01'
end = '2020-12-31'
yfObj = yf.Ticker(ticker)
df = yfObj.history(start=start, end=end)
N_fast = 12
N_slow = 26
level = 1  # threshold, also used when plotting the levels below
df_reversion = MACDReversionStrategy(df.copy(),
    N_fast=N_fast, N_slow=N_slow, level=level)
fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
ax[0].plot(df_reversion['strat_cum_returns']*100,
label='MACD Reversion')
ax[0].plot(df_reversion['cum_returns']*100, label=f'{ticker}')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(
    'Cumulative Returns for MACD Reversion Strategy ' +
    f'and Buy and Hold for {ticker}')
ax[0].legend()
ax[1].plot(df_reversion['MACD'])
ax[1].axhline(level, label='Short Level', color=colors[1],
linestyle='--')
ax[1].axhline(-level, label='Long Level', color=colors[1],
linestyle='--')
ax[1].axhline(0, label='Neutral', color='k', linestyle='--')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('MACD')
ax[1].set_title(f'{N_fast}/{N_slow} MACD for {ticker}')
plt.tight_layout()
plt.show()


Our MACD mean reversion strategy outpaces the underlying stock from 2000–2016, before the strategy begins giving back some value and then underperforms for a few years. Starting in 2018 and continuing through the end of 2020, however, the strategy simply takes off!

One thing to notice about this model is that the range of values for the MACD increases over time. It’s likely that the signal values we chose (-1 and 1 for simplicity) aren’t optimal and that an adaptive value may prove better. The drawback is that we now have another parameter we need to fit, which can lead to overfitting our results.

# MACD with Signal Line

This next method relies on the MACD signal line. This is just the EMA of the MACD itself. Because of this, we can re-use a lot of our code from the previous functions to add the signal line to our data.

def calcMACDSignal(data: pd.DataFrame, N_fast: int, N_slow: int,
        N_sl: int = 9):
    data = calcMACD(data, N_fast=N_fast, N_slow=N_slow)
    data = calcEMA(data, N_sl, key='MACD')
    # Rename columns
    data.rename(
        columns={f'SMA_{N_sl}': f'SMA_MACD_{N_sl}',
                 f'EMA_{N_sl}': f'SignalLine_{N_sl}'}, inplace=True)
    return data


A common way to trade the MACD with the signal line is to buy when the MACD is above the signal line, and to sell or short when it crosses back below.

def MACDSignalStrategy(data, N_fast=12, N_slow=26, N_sl=9,
        shorts=False):
    df = calcMACDSignal(data, N_fast, N_slow, N_sl)
    df['position'] = np.nan
    df['position'] = np.where(df['MACD']>df[f'SignalLine_{N_sl}'], 1,
        df['position'])

    if shorts:
        df['position'] = np.where(df['MACD']<df[f'SignalLine_{N_sl}'],
            -1, df['position'])
    else:
        df['position'] = np.where(df['MACD']<df[f'SignalLine_{N_sl}'],
            0, df['position'])
    df['position'] = df['position'].ffill().fillna(0)
    return calcReturns(df)


For the signal line, it’s fairly typical to look at the 9-day EMA, so that’s what we’ll use here. Again, no fiddling with the settings or optimization, we’re just taking a first pass at the backtest.

N_sl = 9
df_signal = MACDSignalStrategy(df.copy(), N_fast=N_fast,
N_slow=N_slow, N_sl=N_sl)

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
ax[0].plot(df_signal['strat_cum_returns']*100,
label='MACD Signal Strategy')
ax[0].plot(df_signal['cum_returns']*100, label=f'{ticker}')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(
    f'Cumulative Returns for MACD Signal Strategy and ' +
    f'Buy and Hold for {ticker}')
ax[0].legend()

ax[1].plot(df_signal['MACD'], label='MACD')
ax[1].plot(df_signal[f'SignalLine_{N_sl}'],
label='Signal Line', linestyle=':')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('MACD')
ax[1].set_title(f'{N_fast}/{N_slow}/{N_sl} MACD for {ticker}')
ax[1].legend()

plt.tight_layout()
plt.show()


Here, our results weren’t nearly as good as those of the mean reversion model (the interplay between the signal line and the MACD is hard to discern at this time scale; zooming in will show its many crosses). This strategy does, however, avoid large or long drawdowns. It seems to be slow and steady, but as a long-only strategy it fails to win the race.

In a more complex setup where you are managing a portfolio of strategies, an equity curve like the one we’re showing for the MACD signal strategy could be very valuable. The steady returns could provide a safe haven to allocate capital to if your high-return strategies suddenly underperform due to a shift in the market regime.

If you’re looking for all-out returns, changing some of the parameters could be a good way to go beyond the standard values we used here.

# MACD Momentum

Another technique that is frequently used is to take the difference between the MACD and the signal line. This is then plotted as vertical bars and called a MACD bar or MACD histogram plot. If the bars grow over time, then we have increasing momentum and go long; if they shrink, then we have slowing momentum and a possible reversal.

Calculating this is rather straightforward: we just take our MACD and subtract off the signal line. Again, we can re-use a lot of our previous code to keep things short and sweet.

def calcMACDBars(data: pd.DataFrame, N_fast: int, N_slow: int,
        N_sl: int = 9):
    data = calcMACDSignal(data, N_fast=N_fast, N_slow=N_slow,
        N_sl=N_sl)
    data['MACD_bars'] = data['MACD'] - data[f'SignalLine_{N_sl}']
    return data


The histogram is positive when the MACD is above the signal line, and negative when it falls below.

This is an indicator of an indicator and is a few steps removed from the actual price. Regardless, there are a variety of ways to trade using it. One way is to look for increasing or decreasing momentum. If consecutive, positive bars grow, it means that we have a bullish signal for increasing momentum as the MACD moves away from the signal line. If they’re moving closer to one another, then we have a bearish signal and can short it.

We can also look for crossovers, however this is the same as the signal line strategy we implemented above.

You’ll also see things such as peak-trough divergences or slant divergences to generate signals. These look for consecutive hills in the MACD histogram chart and are usually picked up via visual inspection.

We’re going to start with a momentum strategy that looks at the growth of consecutive bars. We’ll use the standard, 12/26/9 format for the MACD signal line parameters, but we’ll see if we can pick up on three consecutive days of growth (positive or negative) in the bars for buy/short signals.

def MACDMomentumStrategy(data, N_fast=12, N_slow=26, N_sl=9,
        N_mom=3, shorts=False):
    df = calcMACDBars(data, N_fast=N_fast, N_slow=N_slow, N_sl=N_sl)
    df['growth'] = np.sign(df['MACD_bars'].diff(1))
    df['momentum'] = df['growth'].rolling(N_mom).sum()
    if shorts:
        df['position'] = df['momentum'].map(
            lambda x: np.sign(x) * 1 if np.abs(x) == N_mom else 0)
    else:
        df['position'] = df['momentum'].map(
            lambda x: 1 if x == N_mom else 0)
    df['position'] = df['position'].ffill().fillna(0)
    return calcReturns(df)
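
The growth and momentum columns are easier to follow on a tiny example. The sketch below uses made-up bar heights (illustrative values only) and the same `diff`, `sign`, and rolling-sum steps as the function above.

```python
import numpy as np
import pandas as pd

# Made-up MACD bar heights: three days of growth, then four of decline
bars = pd.Series([0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.1, 0.0])
N_mom = 3

growth = np.sign(bars.diff(1))           # +1 if the bar grew, -1 if it shrank
momentum = growth.rolling(N_mom).sum()   # +/-N_mom means N_mom moves in a row
long_only = (momentum == N_mom).astype(int)
print(momentum.tolist())
print(long_only.tolist())
```

Only the fourth day qualifies as a long signal here: as soon as a single bar shrinks, the rolling sum drops below N_mom and the position goes back to 0.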


Running and plotting:

N_mom = 3

df_mom = MACDMomentumStrategy(df.copy(), N_fast=N_fast,
N_slow=N_slow,N_sl=N_sl, N_mom=N_mom)

fig, ax = plt.subplots(3, figsize=(12, 8), sharex=True)
ax[0].plot(df_mom['strat_cum_returns']*100, label='MACD Momentum')
ax[0].plot(df_mom['cum_returns']*100, label=f'{ticker}')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(f'Cumulative Returns for MACD Momentum Strategy and Buy and Hold for {ticker}')
ax[0].legend()
ax[1].plot(df_mom['MACD'], label='MACD')
ax[1].plot(df_mom[f'SignalLine_{N_sl}'], label='Signal Line',
linestyle=':')
ax[1].set_title(f'{N_fast}/{N_slow}/{N_sl} MACD for {ticker}')
ax[1].legend()
ax[2].bar(df_mom.index, df_mom['MACD_bars'], label='MACD Bars',
color=colors[4])
ax[2].set_xlabel('Date')
ax[2].set_title(f'MACD Bars for {ticker}')
plt.tight_layout()
plt.show()


This one doesn’t look so great. Losing 15% over 20 years is not a strategy I want to be invested in. If we look a little deeper into this one, we can see that we jump in and out of a lot of positions very quickly. Usually we have some momentum, but because the momentum signal we’re looking at can change after a day or two, we aren’t in a position long enough to ride a trend. Essentially, we’re acting like a trend follower, but we bail so quickly that we wind up taking a lot of small losses and have very few gains to show for it.

To alleviate this, let’s see if we can improve by using our MACD bars to signal momentum and only exit a position when the MACD bar crosses over. To see this, we’ll look for a sign change from positive to negative or vice versa.

def MACDSignalMomentumStrategy(data, N_fast=12, N_slow=26, N_sl=9,
        N_mom=3, shorts=False):
    df = calcMACDBars(data, N_fast=N_fast, N_slow=N_slow, N_sl=N_sl)
    df['growth'] = np.sign(df['MACD_bars'].diff(1))
    df['momentum'] = df['growth'].rolling(N_mom).sum()
    # Enter a long/short position if momentum is going in the right
    # direction and wait for cross-over
    position = np.zeros(len(df))  # start flat
    last_row = None
    for i, _row in enumerate(df.iterrows()):
        row = _row[1]
        mom = row['momentum']
        if np.isnan(mom):
            # Stay flat until the indicators have warmed up
            last_row = row.copy()
            continue
        if np.abs(mom) == N_mom and position[i-1] == 0:
            # Enter new position
            if shorts:
                position[i] = np.sign(mom)
            else:
                position[i] = 1 if np.sign(mom) == 1 else 0
        elif row['MACD_bars'] / last_row['MACD_bars'] < 0:
            # MACD bars changed sign since the previous day: exit
            position[i] = 0
        else:
            # Hold position
            position[i] = position[i-1]
        # Remember this row so the next day compares against it
        last_row = row.copy()

    df['position'] = position

    return calcReturns(df)


Running this combined strategy:

df_sig_mom = MACDSignalMomentumStrategy(df.copy(), N_fast=N_fast,
N_slow=N_slow,N_sl=N_sl, N_mom=N_mom)
fig, ax = plt.subplots(3, figsize=(12, 8), sharex=True)
ax[0].plot(df_sig_mom['strat_cum_returns']*100,
label='MACD Momentum')
ax[0].plot(df_sig_mom['cum_returns']*100, label=f'{ticker}')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(f'Cumulative Returns for MACD Signal Strategy ' +
    f'and Buy and Hold for {ticker}')
ax[0].legend()
ax[1].plot(df_sig_mom['MACD'], label='MACD')
ax[1].plot(df_sig_mom[f'SignalLine_{N_sl}'], label='Signal Line',
linestyle=':')
ax[1].set_title(f'{N_fast}/{N_slow}/{N_sl} MACD for {ticker}')
ax[1].legend()
ax[2].bar(df_sig_mom.index, df_sig_mom['MACD_bars'],
label='MACD Bars', color=colors[4])
ax[2].set_xlabel('Date')
ax[2].set_title(f'MACD Bars for {ticker}')
plt.tight_layout()
plt.show()


This combined model does perform better, yielding us about 250% over this time frame. That is still about half of the long-only strategy that we’re trying to beat, but we could continue to tinker with these ideas to come up with something even better.

Before moving on, let’s take a look at some of the key metrics for each of these strategies.

## Comparing Strategies

Below, we have a helper function to get our strategy statistics.

def getStratStats(log_returns: pd.Series,
        risk_free_rate: float = 0.02):
    stats = {}
    # Total Returns
    stats['tot_returns'] = np.exp(log_returns.sum()) - 1
    # Mean Annual Returns
    stats['annual_returns'] = np.exp(log_returns.mean() * 252) - 1

    # Annual Volatility
    stats['annual_volatility'] = log_returns.std() * np.sqrt(252)
    # Sharpe Ratio
    stats['sharpe_ratio'] = (
        (stats['annual_returns'] - risk_free_rate)
        / stats['annual_volatility'])
    # Max Drawdown
    cum_returns = log_returns.cumsum()
    peak = cum_returns.cummax()
    drawdown = peak - cum_returns
    stats['max_drawdown'] = drawdown.max()
    # Max Drawdown Duration
    strat_dd = drawdown[drawdown==0]
    strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
    strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
    strat_dd_days = np.hstack([strat_dd_days,
        (drawdown.index[-1] - strat_dd.index[-1]).days])
    stats['max_drawdown_duration'] = strat_dd_days.max()
    return stats
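
The drawdown portion of the function is worth tracing by hand once. The sketch below repeats those steps on five made-up log returns (illustrative numbers only). Note that because the inputs are log returns, the resulting maximum drawdown is in log-return units; the last line shows one way to convert it to a fraction of the peak, which is an addition for illustration and not part of `getStratStats`.

```python
import numpy as np
import pandas as pd

log_returns = pd.Series([0.10, -0.05, -0.10, 0.05, 0.20])

cum_returns = log_returns.cumsum()     # approx. 0.10, 0.05, -0.05, 0.00, 0.20
peak = cum_returns.cummax()            # approx. 0.10, 0.10,  0.10, 0.10, 0.20
drawdown = peak - cum_returns          # approx. 0.00, 0.05,  0.15, 0.10, 0.00
max_drawdown = drawdown.max()          # largest gap below the running peak

# Convert from log units to a fraction lost from the peak
pct_loss = 1 - np.exp(-max_drawdown)
print(round(max_drawdown, 2), round(pct_loss, 4))
```

For small drawdowns the two numbers are close, but they diverge as losses grow, so it helps to be explicit about which one a results table reports.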


Getting stats for each strategy:

stats_rev = getStratStats(df_reversion['strat_log_returns'])
stats_sig = getStratStats(df_signal['strat_log_returns'])
stats_mom = getStratStats(df_mom['strat_log_returns'])
stats_sig_mom = getStratStats(df_sig_mom['strat_log_returns'])
stats_base = getStratStats(df_reversion['log_returns'])
stats_dict = {'Mean Reversion': stats_rev,
'Signal Line': stats_sig,
'Momentum': stats_mom,
'Signal-Momentum': stats_sig_mom,
'Baseline': stats_base}
pd.DataFrame(stats_dict)


This baseline was pretty tough to beat, with over 9% annual returns despite some large drawdowns. However, the MACD mean reversion strategy beat it easily and with a better Sharpe Ratio to boot. The signal line and the momentum model both severely underperformed the baseline, with the MACD momentum model being flat-out atrocious. Combining these two yielded better results: still below the baseline, but with a reasonable Sharpe Ratio and a lower drawdown (albeit a longer one than the baseline).

# No Guarantees

Each of these models could be tweaked and improved, but they’re only a small taste of the possibilities available to trade using the MACD and its derivatives. No matter how good that mean reversion strategy looks, nothing here should be blindly implemented.

While we ran a fine backtest, there are issues. Dividends weren’t taken into account, transaction costs and slippage were ignored, only one stock was examined, and so forth. If you really want to trade algorithmically, you’re better off with a strategy that manages a diversified portfolio, on data that has been adjusted for dividends, with transaction costs, over a long time horizon.
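
As a rough illustration of how transaction costs could be layered onto these backtests, the sketch below charges a flat proportional cost every time the held position changes. Everything here is an assumption made for illustration: `apply_costs` is a hypothetical helper, the 0.1% cost is arbitrary, and slippage and financing costs are not modeled.

```python
import numpy as np
import pandas as pd

def apply_costs(position: pd.Series, log_returns: pd.Series,
                cost: float = 0.001) -> pd.Series:
    """Hypothetical helper: strategy log returns net of a per-trade cost."""
    # Position actually held over each bar (yesterday's signal), as in calcReturns
    held = position.shift(1).fillna(0)
    # Units traded to get into that position (a flip from 1 to -1 counts as 2)
    turnover = held.diff().abs().fillna(0)
    gross = held * log_returns
    return gross + turnover * np.log(1 - cost)

# Made-up signals and returns, for illustration only
position = pd.Series([0, 1, 1, -1, 0])
log_returns = pd.Series([0.0, 0.01, 0.02, -0.01, 0.03])
net = apply_costs(position, log_returns, cost=0.001)
print(net.round(5).tolist())
```

Strategies that trade often, such as the momentum model above, are the ones most affected by an adjustment like this.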

At Raposa, we’re building an engine that will enable you to quickly test ideas like these on a portfolio of securities, with proper money management, a plethora of signals, and high quality data, all without a single line of code. We have a state-of-the-art system that will allow you to customize your model to suit your needs, and deploy it when you’re ready so you can trade live. If you want to move beyond simple backtests and strategies, then sign up below to be kept up to date on our latest developments.

Bollinger Bands, first developed by John Bollinger in the early 1980s, measure the volatility range of a security over time. They provide an envelope around the price and can be leveraged in a variety of trading strategies. Some use them on their own, but most frequently they’re combined with other indicators to confirm trends or signal reversals.

We’re going to walk through calculations with the math, pseudo-code, and examples in Python. On top of that, we have a few different trading ideas and ways these can be incorporated into your system.

## How to Calculate Bollinger Bands

The Bands require two parameters, N and m. N gives the number of periods we are going to use to calculate the standard deviation (STD or \sigma) and the simple moving average (SMA) used to construct the Bands. m is a multiple that we apply to the standard deviation, so we’re going to set bands at m \sigma above and below the SMA. Most people use N=20 and m=2 for these settings. With these, we can calculate the Bollinger Bands in 4 simple steps:

1. Calculate the typical price (TP). Typical price is the average of the high, low, and close for the day.

TP[t] = (Close[t] + High[t] + Low[t]) / 3

2. Calculate the simple moving average of the typical price over the past N days (SMA(TP)).

SMA_TP[t] = sum(TP[t-N+1:t+1]) / N

3. Calculate the sample standard deviation of the typical price for the past N days.

STD_TP[t] = sqrt(sum((TP[t-N+1:t+1] - SMA_TP[t])**2) / (N - 1))

4. Get the upper and lower Bands by adding m times the standard deviation to the SMA(TP) and subtracting it from the SMA(TP).

UBB[t] = SMA_TP[t] + m * STD_TP[t]
LBB[t] = SMA_TP[t] - m * STD_TP[t]

Or, mathematically we can write:

TP_t = \frac{C_t + H_t + L_t}{3}

SMA^{TP}_t = \frac{1}{N} \sum_{i=0}^{N-1} TP_{t-i}

\sigma^{TP}_t = \sqrt{\frac{\sum_{i=0}^{N-1} (TP_{t-i} - SMA^{TP}_t)^2}{N-1}}

UBB_t = SMA^{TP}_t + m \sigma^{TP}_t

LBB_t = SMA^{TP}_t - m \sigma^{TP}_t

Let’s turn to providing the details in Python with an example.

## Calculating Bollinger Bands in Python

First, we’ll start with data. In this case, let’s play with MCD.

table = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = table[0]
syms = df['Symbol']
# Sample symbols
ticker = np.random.choice(syms.values)
ticker = "MCD"

start = '2015-01-01'
end = '2016-12-31'

# Get Data
yfObj = yf.Ticker(ticker)
data = yfObj.history(start=start, end=end)
data.drop(['Open', 'Volume', 'Dividends',
'Stock Splits'], inplace=True, axis=1)


Now, we can implement our four steps in just a few lines of code.

N = 20
m = 2
data['TP'] = data.apply(
lambda x: np.mean(x[['High', 'Low', 'Close']]), axis=1)
data[f'SMA_{N}'] = data['TP'].rolling(N).mean()
data['STD'] = data['TP'].rolling(N).std(ddof=1)
data['UBB'] = data[f'SMA_{N}'] + m * data['STD']
data['LBB'] = data[f'SMA_{N}'] - m * data['STD']
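
The same four steps can be wrapped in a reusable function. The sketch below is one possible packaging, not code from the original post: `bollinger_bands` is a hypothetical name, the typical price is computed with a row-wise `mean(axis=1)` instead of `apply`, and the price series is synthetic so the block runs without downloading data.

```python
import numpy as np
import pandas as pd

def bollinger_bands(data: pd.DataFrame, N: int = 20, m: float = 2):
    """Hypothetical helper: return (SMA, upper band, lower band) of the typical price."""
    tp = data[['High', 'Low', 'Close']].mean(axis=1)  # row-wise mean, no apply needed
    sma = tp.rolling(N).mean()
    std = tp.rolling(N).std(ddof=1)                   # sample standard deviation
    return sma, sma + m * std, sma - m * std

# Synthetic random-walk prices, used only to exercise the function
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 60).cumsum())
demo = pd.DataFrame({'High': close + 1, 'Low': close - 1, 'Close': close})
sma, ubb, lbb = bollinger_bands(demo, N=20, m=2)
print(pd.concat({'SMA': sma, 'UBB': ubb, 'LBB': lbb}, axis=1).tail(3))
```

The first N - 1 rows are NaN while the rolling window fills, and the SMA always sits exactly halfway between the two bands.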


Plotting the results:

plt.figure(figsize=(15, 10))
plt.plot(data['Close'], label='Price')
plt.plot(data['UBB'], label='UBB')
plt.plot(data['LBB'], label='LBB')
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'Price and Bollinger Bands for {ticker}')
plt.legend()
plt.show()

The Bollinger Bands create a smooth envelope around most of the price action. There are a few cases where the price breaks outside of the envelope, which may indicate trading signals. In fact, this is the most straightforward way to trade this signal: simply buy it when the price moves below the lower band or short it when it moves above. This provides a simple mean reversion strategy.

We do have the simple moving average of the TP (SMA(TP)) as well, which can be used like the centerline in an oscillator strategy like the RSI. We could close our position when the price reaches the SMA(TP), rather than wait for it to reach the other side of the Band.

## Following the Trend with Bollinger Bands

Like many indicators, we can leverage the Bollinger Bands in a trend following strategy as well. Traders will often use two sets of Bands in conjunction with one another to identify trending price action. For example, we can add a 1 \sigma band and identify trends when the price is in between the 1 \sigma and 2 \sigma upper or lower bands.

We’d do it like this:

m1 = 1
m2 = 2
data[f'SMA_{N}'] = data['TP'].rolling(N).mean()
data['STD'] = data['TP'].rolling(N).std(ddof=1)
data[f'UBB_{m1}'] = data[f'SMA_{N}'] + m1 * data['STD']
data[f'LBB_{m1}'] = data[f'SMA_{N}'] - m1 * data['STD']
data[f'UBB_{m2}'] = data[f'SMA_{N}'] + m2 * data['STD']
data[f'LBB_{m2}'] = data[f'SMA_{N}'] - m2 * data['STD']

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

plt.figure(figsize=(15, 10))
plt.plot(data['Close'], label='Price', zorder=100)
plt.plot(data[f'UBB_{m1}'], c=colors[2])
plt.plot(data[f'LBB_{m1}'], c=colors[2])
plt.fill_between(data.index, data[f'UBB_{m1}'], data[f'LBB_{m1}'],
    color=colors[2], label='Neutral Zone', alpha=0.3)
plt.plot(data[f'UBB_{m2}'], c=colors[1])
plt.plot(data[f'LBB_{m2}'], c=colors[4])
plt.fill_between(data.index, data[f'UBB_{m2}'], data[f'UBB_{m1}'],
    color=colors[1], label='Up-Trend', alpha=0.3)
plt.fill_between(data.index, data[f'LBB_{m2}'], data[f'LBB_{m1}'],
    color=colors[4], label='Down-Trend', alpha=0.3)
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'Price and Bollinger Bands for {ticker}')

plt.legend()
plt.show()
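
The shaded zones in the plot can also be turned into labels the computer can act on. The sketch below is one possible way to do that, not a definitive rule set: `label_zones` and the zone names are hypothetical, and bars that fall outside the 2\sigma bands (or inside the warm-up period where the bands are NaN) are lumped together as `'outside'`.

```python
import numpy as np
import pandas as pd


def label_zones(close: pd.Series, sma: pd.Series, std: pd.Series,
                m1: float = 1.0, m2: float = 2.0) -> pd.Series:
    """Label each bar by where the close sits relative to the two band sets."""
    upper1, upper2 = sma + m1 * std, sma + m2 * std
    lower1, lower2 = sma - m1 * std, sma - m2 * std
    conditions = [
        ((close > upper1) & (close <= upper2)).to_numpy(),   # between the upper bands
        ((close < lower1) & (close >= lower2)).to_numpy(),   # between the lower bands
        ((close >= lower1) & (close <= upper1)).to_numpy(),  # inside the inner bands
    ]
    labels = ['up-trend', 'down-trend', 'neutral']
    # Anything matching no condition (beyond the outer bands, or NaN) is 'outside'
    return pd.Series(np.select(conditions, labels, default='outside'),
                     index=close.index)


# Tiny hand-made example: SMA of 10 and a standard deviation of 1 on every bar
close = pd.Series([10.0, 11.5, 8.5, 13.0])
sma = pd.Series([10.0] * 4)
std = pd.Series([1.0] * 4)
zones = label_zones(close, sma, std)
```

On real data you would pass `data['Close']`, `data[f'SMA_{N}']`, and `data['STD']` in place of the hand-made series.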


You can see here that the price frequently stays within the neutral zone, but then breaks up or down and seems to keep a streak going. In the plot below, we zoom in on a quick price rise that exhibits this characteristic from mid-2015 to 2016.

plt.figure(figsize=(15, 10))
plt.plot(data['Close'], label='Price', marker='o',
zorder=100)
plt.plot(data[f'UBB_{m1}'], c=colors[2])
plt.plot(data[f'LBB_{m1}'], c=colors[2])
plt.fill_between(data.index, data[f'UBB_{m1}'], data[f'LBB_{m1}'],
color=colors[2], label='Neutral Zone', alpha=0.3)
plt.plot(data[f'UBB_{m2}'], c=colors[1])
plt.plot(data[f'LBB_{m2}'], c=colors[4])
plt.fill_between(data.index, data[f'UBB_{m2}'], data[f'UBB_{m1}'],
color=colors[1], label='Up-Trend', alpha=0.3)
plt.fill_between(data.index, data[f'LBB_{m2}'], data[f'LBB_{m1}'],
color=colors[4], label='Down-Trend', alpha=0.3)
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'Price and Bollinger Bands for {ticker}')
plt.xlim([pd.to_datetime('2015-08-01'), pd.to_datetime('2016-05-01')])
plt.legend()
plt.show()

In this plot, we added the individual data points to more clearly see the precise closing prices from day to day. Zooming in, you can see that breaks above the 1\sigma upper band seem to be followed by a streak of days above, indicating times you’d be long, riding the trend as it increases. While less frequent and shorter, days below the neutral zone appear to persist, with the entry point often higher than the exit, indicating potentially profitable short opportunities.

## Tightening our Belt

You’ll notice that the width of the bands does not remain constant over time; they expand and contract with volatility. We can use this expansion and contraction to derive another Bollinger Band-based indicator called Band Width. This is calculated by subtracting the lower band from the upper and dividing by the SMA(TP).

BW_N = \frac{UBB_N-LBB_N}{SMA_N^{TP}}

We can implement that on our data with the following code:

data['BW'] = (data['UBB'] - data['LBB']) / data[f'SMA_{N}']

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)
ax[0].plot(data['Close'], label='Price')
ax[0].plot(data['UBB'], c=colors[1])
ax[0].plot(data['LBB'], c=colors[1])
ax[0].fill_between(data.index, data['UBB'], data['LBB'],
    color=colors[1], alpha=0.3)
ax[0].set_title(f'Price and Bollinger Bands for {ticker}')
ax[0].set_ylabel('Price ($)')

ax[1].plot(data['BW'])
ax[1].set_ylabel('Band Width')
ax[1].set_xlabel('Date')
ax[1].set_title(f'Bollinger Band Width for {ticker}')

plt.tight_layout()
plt.show()
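
One hypothetical way to put Band Width to work is to flag bars where it sits at its lowest value over some lookback window. The sketch below assumes that definition of "low"; the 50-bar lookback and the function names are illustrative choices, and synthetic prices replace real data.

```python
import numpy as np
import pandas as pd


def band_width(tp: pd.Series, n: int = 20, m: float = 2.0) -> pd.Series:
    """Band Width: (upper band - lower band) / SMA of the typical price."""
    sma = tp.rolling(n).mean()
    std = tp.rolling(n).std(ddof=1)
    ubb, lbb = sma + m * std, sma - m * std
    return (ubb - lbb) / sma


def flag_low_width(bw: pd.Series, lookback: int = 50) -> pd.Series:
    """True where Band Width equals its minimum over the last `lookback` bars."""
    return bw <= bw.rolling(lookback).min()


# Synthetic typical prices (an assumption) so the sketch runs offline
rng = np.random.default_rng(1)
tp = pd.Series(100 + rng.normal(0, 1, 120).cumsum())
bw = band_width(tp)
low_width = flag_low_width(bw)
```

Bars in the warm-up period compare against NaN and therefore come out as `False`.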


Typically this value stays fairly low: usually below 0.5 and almost always below 1. It can blow up at times, though, as it did for GameStop during the early-2021 short squeeze shown below.

ticker = 'GME'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start='2020-08-01', end='2021-07-01')

N = 20
m = 2
data['TP'] = data.apply(
lambda x: np.mean(x[['High', 'Low', 'Close']]), axis=1)
data[f'SMA_{N}'] = data['TP'].rolling(N).mean()
data['STD'] = data['TP'].rolling(N).std(ddof=1)
data['UBB'] = data[f'SMA_{N}'] + m * data['STD']
data['LBB'] = data[f'SMA_{N}'] - m * data['STD']
data['BW'] = (data['UBB'] - data['LBB']) / data[f'SMA_{N}']

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)
#
ax[0].plot(data['Close'], label='Price')
ax[0].plot(data['UBB'], c=colors[1])
ax[0].plot(data['LBB'], c=colors[1])
ax[0].fill_between(data.index, data['UBB'], data['LBB'],
color=colors[1], alpha=0.3)
ax[0].annotate('GME Squeeze Begins',
xy=(pd.to_datetime('2021-01-10'), 50),
xytext=(pd.to_datetime('2020-12-01'), 100),
arrowprops=dict(arrowstyle='->'))
ax[0].set_title(f'Price and Bollinger Bands for {ticker}')
ax[0].set_ylabel('Price ($)')

ax[1].plot(data['BW'])
ax[1].annotate('Band Width Blows Up',
    xy=(pd.to_datetime('2021-01-10'), 1),
    xytext=(pd.to_datetime('2020-12-01'), 2),
    arrowprops=dict(arrowstyle='->'))
ax[1].set_title('Bollinger Band Width for GME')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('Band Width')

plt.tight_layout()
plt.show()

Bollinger himself states that lows in this Band Width are often followed by breakouts. To test this, we can combine this indicator with a directional indicator or some other confirmation signal such as an oscillator or EMA to see if we can hit profitable trades.

## Test, and be Profitable!

Of course, we’re just giving a verbal description of how these strategies could work, with some illustrations. You’d have to run a proper backtest in order to see if there’s a profitable signal to be traded or not.

We’re building complete backtest systems that will allow you to test your strategies, gauge your risk, and deploy in the markets, all with no code. Sign up below to join the wait list and learn more!

## How to add Exponential Moving Averages to Your Trading Arsenal

Moving average indicators are used in a variety of trading strategies to spot long-term trends in the price data. One potential drawback of simple moving average strategies is that they weight all of the prices equally, whereas you might want more recent prices to take on a greater importance. The exponential moving average (EMA) is one way to accomplish this.

## TL;DR

We walk through the EMA calculation with code examples and compare it to the SMA.

## Calculating the Exponential Moving Average

The EMA gives more weight to the most recent prices via a weighting multiplier. This multiplier is applied to the last price so that it accounts for a larger chunk of the moving average than the other data points.

The EMA is calculated by taking the most recent price (we’ll call it P_t, or “price at time t”) and subtracting the EMA from the previous time period (EMA_{t-1}). This difference is scaled by a multiplier, 2/(N+1), which depends on the number of time periods you set your EMA to (N), and then added back to EMA_{t-1}.

Mathematically, we can write it like this:

EMA_t = \big( P_t - EMA_{t-1} \big) \frac{2}{N+1} + EMA_{t-1}

You may have noticed that the above equation has a slight problem: how does it get started? It references the last period’s EMA, so what does the very first calculation reference? This is usually resolved by substituting the simple moving average (SMA) to initialize the calculation so that you can build the EMA for all time periods after the first.

Let’s show how this works with a simple example in Python by importing our packages.

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt

From here, we will build two functions to work together and calculate our indicator. The first function will be a simple implementation of the formula we outlined above:

def _calcEMA(P, last_ema, N):
    return (P - last_ema) * (2 / (N + 1)) + last_ema

The second function will calculate the EMA for all of our data. It first computes the SMA, then iterates over the rows: the first N rows take the value in the SMA column, and every later row calls the _calcEMA function we defined above.

def calcEMA(data, N):
    # Initialize series
    data['SMA_' + str(N)] = data['Close'].rolling(N).mean()
    ema = np.zeros(len(data))
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if i < N:
            # Seed with the SMA (NaN until the window is full)
            ema[i] += row['SMA_' + str(N)]
        else:
            ema[i] += _calcEMA(row['Close'], ema[i-1], N)
    data['EMA_' + str(N)] = ema.copy()
    return data

Now, let’s get some data and see how this works. We’ll pull a shorter time period than we would use for a backtest and compare 10, 50, and 100 days of the EMA and SMA.

ticker = 'GM'
yfObj = yf.Ticker(ticker)
# history() takes the date range only; the ticker is already bound to yfObj
data = yfObj.history(start='2018-01-01', end='2020-12-31')
N = [10, 50, 100]
_ = [calcEMA(data, n) for n in N]

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

fig, ax = plt.subplots(figsize=(18, 8))
ax.plot(data['Close'], label='Close')
for i, n in enumerate(N, 1):
    ax.plot(data[f'EMA_{n}'], label=f'EMA-{n}', color=colors[i])
    ax.plot(data[f'SMA_{n}'], label=f'SMA-{n}', color=colors[i], linestyle=':')
ax.legend()
ax.set_title(f'EMA and Closing Price Comparison for {ticker}')
plt.show()

You can see in the plot above that the EMAs are more responsive to recent changes than the SMAs. Shorter time horizons are also more responsive than longer time horizons, which have a price “memory” that may stretch back months or more.

Moving averages of all types are lagging indicators, meaning they only tell you what has already happened in the price. However, this doesn’t mean they can’t be useful for identifying trends and developing strategies that use one or more moving average indicators.

If you have an idea, go ahead and test it out: see how EMA, SMA, and other values can be combined to develop new and profitable trading strategies.

At Raposa, you can quickly test your ideas with no code and high-quality data to run backtests. Find a strategy that works for you and deploy it to get live trading alerts.

## Higher Highs, Lower Lows, and Calculating Price Trends in Python

What happens if the price direction disagrees with your model? For example, the price may be increasing, but your RSI (a derivative of price) is decreasing. Should you trade this if you get a signal? Or could you use this disagreement itself as a signal?

Situations like this are referred to as divergences because the price and the indicator are moving in opposite directions.
Typically you’ll see traders discuss price making a “higher high” while some indicator makes a “lower high.” In technical analysis, traders will call this a bearish divergence and they’ll forecast a price drop. The opposite situation indicates a bullish divergence, leading to a forecasted price rise.

We don’t really care what the forecast is; we want to see if there are any statistical signals we can extract from this pattern. To that end, we need a formula to capture these movements.

## What makes a Peak?

Having a computer spot a peak or trough is actually more difficult than it may seem. Take a look at the plot below. What would you call a “peak?”

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import yfinance as yf

ticker = 'F'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start='2010-01-01', end='2010-07-01')

plt.figure(figsize=(15, 8))
plt.plot(data['Close'])
plt.title(f'Price Chart for {ticker}')
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.show()


We have local maxima and minima – points that are higher or lower than either of the points immediately to the left and right – scattered throughout the chart.

Clearly the peak (the global maximum over this range) occurs on April 26th when the price closes at $9.80. But would you count the prices on April 5th or April 15th? They are local maxima, but does that make them peaks to mark a divergence? If not, why not? And when would you know?

data['local_max'] = data['Close'][
    (data['Close'].shift(1) < data['Close']) &
    (data['Close'].shift(-1) < data['Close'])]

data['local_min'] = data['Close'][
    (data['Close'].shift(1) > data['Close']) &
    (data['Close'].shift(-1) > data['Close'])]

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

plt.figure(figsize=(15, 8))
plt.plot(data['Close'], zorder=0)
plt.scatter(data.index, data['local_max'], s=100,
    label='Maxima', marker='^', c=colors[1])
plt.scatter(data.index, data['local_min'], s=100,
    label='Minima', marker='v', c=colors[2])
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'Local Maxima and Minima for {ticker}')
plt.legend()
plt.show()


We need to do some kind of filtering or develop some type of rules to identify peaks and troughs so we don’t end up with such a noisy signal like we have in the plot above. Additionally, we need to make sure we aren’t looking ahead in our data when doing so.

One of the biggest issues with backtesting (and it is a particularly pernicious problem when dealing with divergences) is lookahead bias. You need to be certain that your test does not take into account data that it would not have had at the time.

For example, if we’re trading Ford in April of 2010, and we get to a new high of $9.80 on April 26th, we don’t actually know if that is a peak or not. It happens to drop 6% the next day and continue downward for most of the next month, but from the perspective of a trader on the 26th, we have no idea what will happen next. Whatever rule we implement in our algorithmic strategy has to take this into account and cannot trade on the 26th because it’s a peak; that would be lookahead bias and would skew your results terribly.

Further, we have one more complication: we aren’t just looking for new peaks and troughs, but a succession of peaks and troughs to make a divergence indicator. To extract “higher highs” out of a signal, we need at least two peaks with the second peak being higher than the first.

## Coding a Convergence/Divergence Indicator

For our purposes, we can use the argrelextrema function from SciPy’s signal processing library.

from scipy.signal import argrelextrema

This function will give us the max and min values from a time series. We simply need to pass our data, tell it whether we’re looking for maxima or minima values, and then indicate how many data points to either side we’re going to look. As shown above, we don’t necessarily want to get every local max/min; instead we can look in wider areas to pull out peaks for our divergence indicator. Take a look at the example below where we wait for 5 data points (order argument) to make our selection.

max_idx = argrelextrema(data['Close'].values, np.greater, order=5)[0]
min_idx = argrelextrema(data['Close'].values, np.less, order=5)[0]

plt.figure(figsize=(15, 8))
plt.plot(data['Close'], zorder=0)
plt.scatter(data.iloc[max_idx].index, data.iloc[max_idx]['Close'],
    label='Maxima', s=100, color=colors[1], marker='^')
plt.scatter(data.iloc[min_idx].index, data.iloc[min_idx]['Close'],
    label='Minima', s=100, color=colors[2], marker='v')
plt.legend()
plt.show()

This plot looks much more like what we’d expect when pulling out peaks and troughs.

## Get Consecutive Peaks

Our next step is going to look for consecutive peaks or troughs so we can get “higher highs”, “lower lows”, “lower highs”, or “higher lows”. Ultimately we’ll have four different cases to look for which will all follow very similar logic. Right now, let’s just look at cases where there are at least two consecutive “higher highs.”

To do this, we’re going to have to loop over our indices and check the values against the previous result. If the new peak is greater than the previous peak, we can append it to a list and move on; otherwise we start over with this new peak and look at the next one.

from collections import deque

# Get K consecutive higher peaks
K = 2
high_idx = argrelextrema(data['Close'].values, np.greater, order=5)[0]
# Use a plain array so highs[i] is positional rather than a label lookup
highs = data.iloc[high_idx]['Close'].values

extrema = []
ex_deque = deque(maxlen=K)
for i, idx in enumerate(high_idx):
    if i == 0:
        ex_deque.append(idx)
        continue
    if highs[i] < highs[i-1]:
        # Streak broken: start over from this peak
        ex_deque.clear()

    ex_deque.append(idx)
    if len(ex_deque) == K:
        # K-consecutive higher highs found
        extrema.append(ex_deque.copy())

From the plot above, we should find two sets of consecutive peaks with K=2. And printing out our extrema list, that’s what we see.

print(extrema)

[deque([21, 50], maxlen=2), deque([50, 77], maxlen=2)]

Let’s also plot this:

close = data['Close'].values
dates = data.index

plt.figure(figsize=(15, 8))
plt.plot(data['Close'])
_ = [plt.plot(dates[i], close[i], c=colors[1]) for i in extrema]
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'Higher Highs for {ticker} Closing Price')
plt.legend(['Close', 'Consecutive Highs'])
plt.show()


We’ve pulled out our consecutive highs, so now let’s put some functions together to get lower lows, lower highs, and higher lows.

def getHigherLows(data: np.array, order=5, K=2):
    '''
    Finds consecutive higher lows in price pattern.
    Must not be exceeded within the number of periods indicated by the order
    parameter for the value to be confirmed.
    K determines how many consecutive lows need to be higher.
    '''
    # Get lows
    low_idx = argrelextrema(data, np.less, order=order)[0]
    lows = data[low_idx]
    # Ensure consecutive lows are higher than previous lows
    extrema = []
    ex_deque = deque(maxlen=K)
    for i, idx in enumerate(low_idx):
        if i == 0:
            ex_deque.append(idx)
            continue
        if lows[i] < lows[i-1]:
            ex_deque.clear()

        ex_deque.append(idx)
        if len(ex_deque) == K:
            extrema.append(ex_deque.copy())

    return extrema

def getLowerHighs(data: np.array, order=5, K=2):
    '''
    Finds consecutive lower highs in price pattern.
    Must not be exceeded within the number of periods indicated by the order
    parameter for the value to be confirmed.
    K determines how many consecutive highs need to be lower.
    '''
    # Get highs
    high_idx = argrelextrema(data, np.greater, order=order)[0]
    highs = data[high_idx]
    # Ensure consecutive highs are lower than previous highs
    extrema = []
    ex_deque = deque(maxlen=K)
    for i, idx in enumerate(high_idx):
        if i == 0:
            ex_deque.append(idx)
            continue
        if highs[i] > highs[i-1]:
            ex_deque.clear()

        ex_deque.append(idx)
        if len(ex_deque) == K:
            extrema.append(ex_deque.copy())

    return extrema

def getHigherHighs(data: np.array, order=5, K=2):
    '''
    Finds consecutive higher highs in price pattern.
    Must not be exceeded within the number of periods indicated by the order
    parameter for the value to be confirmed.
    K determines how many consecutive highs need to be higher.
    '''
    # Get highs (pass the order argument through rather than hard-coding 5)
    high_idx = argrelextrema(data, np.greater, order=order)[0]
    highs = data[high_idx]
    # Ensure consecutive highs are higher than previous highs
    extrema = []
    ex_deque = deque(maxlen=K)
    for i, idx in enumerate(high_idx):
        if i == 0:
            ex_deque.append(idx)
            continue
        if highs[i] < highs[i-1]:
            ex_deque.clear()

        ex_deque.append(idx)
        if len(ex_deque) == K:
            extrema.append(ex_deque.copy())

    return extrema

def getLowerLows(data: np.array, order=5, K=2):
    '''
    Finds consecutive lower lows in price pattern.
    Must not be exceeded within the number of periods indicated by the order
    parameter for the value to be confirmed.
    K determines how many consecutive lows need to be lower.
    '''
    # Get lows
    low_idx = argrelextrema(data, np.less, order=order)[0]
    lows = data[low_idx]
    # Ensure consecutive lows are lower than previous lows
    extrema = []
    ex_deque = deque(maxlen=K)
    for i, idx in enumerate(low_idx):
        if i == 0:
            ex_deque.append(idx)
            continue
        if lows[i] > lows[i-1]:
            ex_deque.clear()

        ex_deque.append(idx)
        if len(ex_deque) == K:
            extrema.append(ex_deque.copy())

    return extrema

from matplotlib.lines import Line2D

close = data['Close'].values
dates = data.index

order = 5
K = 2

hh = getHigherHighs(close, order, K)
hl = getHigherLows(close, order, K)
ll = getLowerLows(close, order, K)
lh = getLowerHighs(close, order, K)

plt.figure(figsize=(15, 8))
plt.plot(data['Close'])
_ = [plt.plot(dates[i], close[i], c=colors[1]) for i in hh]
_ = [plt.plot(dates[i], close[i], c=colors[2]) for i in hl]
_ = [plt.plot(dates[i], close[i], c=colors[3]) for i in ll]
_ = [plt.plot(dates[i], close[i], c=colors[4]) for i in lh]
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'Potential Divergence Points for {ticker} Closing Price')
legend_elements = [
    Line2D([0], [0], color=colors[0], label='Close'),
    Line2D([0], [0], color=colors[1], label='Higher Highs'),
    Line2D([0], [0], color=colors[2], label='Higher Lows'),
    Line2D([0], [0], color=colors[3], label='Lower Lows'),
    Line2D([0], [0], color=colors[4], label='Lower Highs')
]
plt.legend(handles=legend_elements)
plt.show()

Now we have functions to identify higher highs and so forth, to mark out potential divergences in price and our indicators.

Looking at the plot, everything looks fine except for the Higher Lows line, which seems to skip over a few low points that are indeed higher than the starting point and lower than the ending point. What’s going on?

Thankfully, there’s no bug. What’s happening is that these local minima don’t fit our precise rules. For argrelextrema, we provide an argument called order, which makes it look that many points to the left and right to find a min or max. In our case, we set this to 5, and it just so happens that a number of these minima are 5 points apart from the last one, meaning they’re excluded because they don’t satisfy this criterion (change order to 3 or 4 in the code above and see how the plot changes).

Another thing to keep in mind: when we’re running a vectorized backtest, we need to wait order-number of periods before we can confirm a higher low, or whatever it is we’re looking for, to avoid lookahead bias.

We can plot our confirmations to illustrate this delay:

from datetime import timedelta

close = data['Close'].values
dates = data.index

order = 5
K = 2

hh = getHigherHighs(close, order, K)
hl = getHigherLows(close, order, K)
ll = getLowerLows(close, order, K)
lh = getLowerHighs(close, order, K)

plt.figure(figsize=(15, 8))
plt.plot(data['Close'])
_ = [plt.plot(dates[i], close[i], c=colors[1]) for i in hh]
_ = [plt.plot(dates[i], close[i], c=colors[2]) for i in hl]
_ = [plt.plot(dates[i], close[i], c=colors[3]) for i in ll]
_ = [plt.plot(dates[i], close[i], c=colors[4]) for i in lh]

_ = [plt.scatter(dates[i[-1]] + timedelta(order), close[i[-1]],
    c=colors[1], marker='^', s=100) for i in hh]
_ = [plt.scatter(dates[i[-1]] + timedelta(order), close[i[-1]],
    c=colors[2], marker='^', s=100) for i in hl]
_ = [plt.scatter(dates[i[-1]] + timedelta(order), close[i[-1]],
    c=colors[3], marker='v', s=100) for i in ll]
_ = [plt.scatter(dates[i[-1]] + timedelta(order), close[i[-1]],
    c=colors[4], marker='v', s=100) for i in lh]
plt.xlabel('Date')
plt.ylabel('Price ($)')
plt.title(f'Potential Divergence Points for {ticker} Closing Price')
legend_elements = [
Line2D([0], [0], color=colors[0], label='Close'),
Line2D([0], [0], color=colors[1], label='Higher Highs'),
Line2D([0], [0], color='w',  marker='^',
markersize=10,
markerfacecolor=colors[1],
label='Higher High Confirmation'),
Line2D([0], [0], color=colors[2], label='Higher Lows'),
Line2D([0], [0], color='w',  marker='^',
markersize=10,
markerfacecolor=colors[2],
label='Higher Lows Confirmation'),
Line2D([0], [0], color=colors[3], label='Lower Lows'),
Line2D([0], [0], color='w',  marker='v',
markersize=10,
markerfacecolor=colors[3],
label='Lower Lows Confirmation'),
Line2D([0], [0], color=colors[4], label='Lower Highs'),
Line2D([0], [0], color='w',  marker='v',
markersize=10,
markerfacecolor=colors[4],
label='Lower Highs Confirmation')
]
plt.legend(handles=legend_elements)
plt.show()
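
To use these confirmations in a backtest rather than just a plot, the same delay can be written into a signal array. The sketch below is one possible way to do it, assuming the delay is counted in bars (rows) rather than calendar days; `confirmation_signal` is a hypothetical helper, and the example reuses the two higher-high patterns printed earlier.

```python
import numpy as np
from collections import deque


def confirmation_signal(extrema, n_bars: int, order: int = 5) -> np.ndarray:
    """Return an array that is 1 on the bar where each pattern is confirmed."""
    signal = np.zeros(n_bars, dtype=int)
    for pattern in extrema:
        # The last extremum is only known to be one `order` bars after it occurs
        confirm_at = pattern[-1] + order
        if confirm_at < n_bars:
            signal[confirm_at] = 1
    return signal


# The two higher-high patterns found above for K=2
extrema = [deque([21, 50], maxlen=2), deque([50, 77], maxlen=2)]
signal = confirmation_signal(extrema, n_bars=80, order=5)
```

Patterns whose confirmation bar falls past the end of the data are dropped, since they could not have been acted on yet.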


These extrema can be traded in a variety of ways. We can use these as a proxy for momentum by taking the slope of the extrema one way or another. This slope then could be used to rank-order securities by their momentum to buy the highest momentum stocks, or those that are above a given threshold.
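
As a rough sketch of the slope idea, we can fit a straight line through the prices at a set of extrema indices and read off its slope. The helper name `extrema_slope` and the synthetic price array are assumptions for illustration; the slope is in price units per bar, so comparing across securities would likely need some normalization.

```python
import numpy as np


def extrema_slope(prices: np.ndarray, idx) -> float:
    """Least-squares slope (price units per bar) through the points at idx."""
    idx = np.asarray(list(idx))
    slope, _intercept = np.polyfit(idx, prices[idx], deg=1)
    return float(slope)


# Synthetic series that rises 0.5 per bar, with two "peaks" at bars 21 and 50
prices = 10 + 0.5 * np.arange(100)
slope = extrema_slope(prices, [21, 50])
```

Each deque returned by the functions above can be passed as `idx` directly.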

Many traders look at applying these extrema to both price and the indicator. If the price makes higher highs and the indicator makes lower highs, then we have a divergence on our hands! We can test these scenarios by applying the exact same functions above to RSI, Stochastic Oscillators, and so forth to build more complex strategies.
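
A simplified version of that check is sketched below. It assumes we compare the indicator at the same bars where price made its two peaks, rather than finding the indicator's own extrema, which is a shortcut and not the only way to define a divergence. The function name and the toy arrays are hypothetical.

```python
import numpy as np


def bearish_divergence(price: np.ndarray, indicator: np.ndarray, peak_pair) -> bool:
    """True if price rises between the two peaks while the indicator falls."""
    first, second = peak_pair
    return bool(price[second] > price[first] and indicator[second] < indicator[first])


# Toy data: price makes a higher high at bar 4, the indicator a lower high
price = np.array([10.0, 12.0, 11.0, 11.5, 13.0])
indicator = np.array([50.0, 70.0, 55.0, 60.0, 65.0])
diverging = bearish_divergence(price, indicator, (1, 4))
```

A stricter version would run the extrema functions on the indicator as well and require its peaks to line up with the price peaks within some tolerance.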

We can also use these rules for confirmation of trend. Maybe there’s a reliable signal if both RSI and price make higher highs together?

Of course, we just used these functions on the closing price, but there’s no reason you couldn’t also apply them to the open, high, or low for your bars.

All of this can get very complex very quickly. We’re building a platform to allow you to easily test your ideas without coding. You can run your models in an event-driven backtest on professional quality data, get all of your stats, and deploy your model to the market with the click of a button. Just drop your email into the box below to get alerts when we go live!

## How to Trade with the MACD

The Moving Average Convergence-Divergence (MACD, sometimes pronounced “Mac-Dee”) is a very popular indicator that is frequently used in momentum and trend following systems. In short, the MACD is a trailing indicator that gives an indication of the general price trend of a security.

### TL;DR

We walk through the reasoning and math behind the MACD along with Python code so you can learn to apply it yourself.

### How the MACD Works

Despite the intimidating name, the MACD is relatively straightforward to understand and calculate. It takes some ideas from other indicators, namely EMA and moving average cross-over strategies, and combines them into a single, easy to use value.

In its most basic form, we have two EMA signals, a fast one and a slow one (12 and 26 days are popular choices). These are calculated every day, then we subtract the slow one from the fast one to get the difference. In pseudocode we have:

1. Calculate fast EMA:
• EMA_fast[t] = (Price - EMA_fast[t-1]) * 2 / (N_fast + 1) + EMA_fast[t-1]
2. Calculate slow EMA:
• EMA_slow[t] = (Price - EMA_slow[t-1]) * 2 / (N_slow + 1) + EMA_slow[t-1]
3. Subtract the two to get MACD:
• MACD[t] = EMA_fast[t] - EMA_slow[t]

Or, if you prefer to write it mathematically:

EMA_t = \big( P_t - EMA_{t-1} \big) \frac{2}{N+1} + EMA_{t-1}
MACD_t = EMA_t^{fast} - EMA_t^{slow}

It’s really that easy.
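
For a quick check of the recursion, pandas can compute the same thing. With `adjust=False` and `span=N`, `ewm` uses a smoothing factor of 2/(N+1) and applies the update above bar by bar. Note one assumption that differs from the SMA-seeded approach described earlier: `ewm` seeds the series with the first price, so early values will not match exactly. The name `macd_ewm` is hypothetical.

```python
import numpy as np
import pandas as pd


def macd_ewm(close: pd.Series, n_fast: int = 12, n_slow: int = 26) -> pd.Series:
    """MACD as fast EMA minus slow EMA, using pandas' recursive ewm."""
    ema_fast = close.ewm(span=n_fast, adjust=False).mean()
    ema_slow = close.ewm(span=n_slow, adjust=False).mean()
    return ema_fast - ema_slow


# A steadily rising synthetic price: the fast EMA should sit above the slow one
close = pd.Series(np.arange(60, dtype=float))
macd = macd_ewm(close)
```

On a flat price series both EMAs equal the price, so the MACD is zero throughout.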

The MACD is called the Moving Average Convergence-Divergence. What converges and diverges are the two EMAs. As the short-term EMA and long-term EMA converge, they get closer to the same value and the indicator moves to 0. Divergence is driven by the short-term EMA moving up or down, causing the two EMAs to move farther apart. There are other ways the terms convergence and divergence are used with this indicator which we’ll get to below.

The most basic way to trade it is buy when MACD > 0, and sell/short when MACD < 0. When it’s positive, we have the faster EMA above the longer EMA, and vice versa when it goes negative. This set up is a basic, exponential moving average crossover strategy.
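
In code, this rule is little more than the sign of the MACD. The sketch below is one way to express it, with one added assumption: positions are delayed by one bar so that a signal computed at the close is only traded on the following bar. `macd_position` and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd


def macd_position(macd: pd.Series, delay: int = 1) -> pd.Series:
    """+1 (long) when MACD > 0, -1 (short) when MACD < 0, 0 otherwise."""
    raw = np.sign(macd).fillna(0)  # NaN warm-up rows become flat (0)
    # Delay the position so we never trade on the bar that generated the signal
    return raw.shift(delay).fillna(0).astype(int)


macd = pd.Series([np.nan, -0.2, -0.1, 0.3, 0.5, -0.4])
position = macd_position(macd)
```

Setting `delay=0` reproduces the rule exactly as stated, at the cost of acting on information from the same bar.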

## MACD Example in Python

To code it, we just need a few basic packages.

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt


The details for the EMA calculation are given in the EMA walkthrough above, so we’ll just show the final code which we’ll leverage in our model.

def _calcEMA(P, last_ema, N):
    return (P - last_ema) * (2 / (N + 1)) + last_ema

def calcEMA(data: pd.DataFrame, N: int, key: str = 'Close'):
    # Initialize series
    data['SMA_' + str(N)] = data[key].rolling(N).mean()
    ema = np.zeros(len(data)) + np.nan
    for i, _row in enumerate(data.iterrows()):
        row = _row[1]
        if np.isnan(ema[i-1]):
            # No previous EMA yet: seed with the SMA
            ema[i] = row['SMA_' + str(N)]
        else:
            ema[i] = _calcEMA(row[key], ema[i-1], N)

    data['EMA_' + str(N)] = ema.copy()
    return data


With the calcEMA function above, we can easily write our MACD function. We just need to call calcEMA twice with the fast and slow parameters, and subtract the values.

def calcMACD(data: pd.DataFrame, N_fast: int, N_slow: int):
    assert N_fast < N_slow, "Fast EMA must be less than slow EMA parameter."
    data = calcEMA(data, N_fast)
    data = calcEMA(data, N_slow)
    # Subtract values to get MACD
    data['MACD'] = data[f'EMA_{N_fast}'] - data[f'EMA_{N_slow}']
    return data


We’re ready to test it. I just grabbed a one year time period from a random stock in the S&P 500 to illustrate how the indicator works. We’ll just run it through our function and take a look at the output.

ticker = 'DTE'
start = '2013-01-01'
end = '2014-01-01'
yfObj = yf.Ticker(ticker)
df = yfObj.history(start=start, end=end)
N_fast = 12
N_slow = 26
data = calcMACD(df, N_fast, N_slow)
# Drop extra columns
data.drop(['Open', 'High', 'Low', 'Volume', 'Dividends', 'Stock Splits'], axis=1, inplace=True)
data.iloc[N_slow-5:N_slow+5]


As you can see in the table above, our function provides the different EMA values (and the SMAs they’re initialized on) in addition to the MACD. We can plot these values below to see how they track with one another.

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

ax[0].plot(data['Close'], label=f'{ticker}', linestyle=':')
ax[0].plot(data[f'EMA_{N_fast}'], label=f'EMA-{N_fast}')
ax[0].plot(data[f'EMA_{N_slow}'], label=f'EMA-{N_slow}')
ax[0].set_ylabel('Price ($)') ax[0].set_title(f'Price and EMA Values for {ticker}') ax[0].legend() ax[1].plot(data['MACD'], label='MACD') ax[1].set_ylabel('MACD') ax[1].set_xlabel('Date') ax[1].set_title(f'MACD for {ticker}') plt.show()  We have a few good oscillations around 0, which, in our simplest strategy, would indicate buy/sell signals as the EMAs cross over. You can also see that the initial price rise from January through May has a high MACD (>0.5), and the MACD peak roughly corresponds with the peak in the price of our security. The MACD begins to fall and turn negative, which follows a healthy retracement in the price before it rebounds again. Also, note that as the MACD becomes more positive, this means that the short-term EMA is increasing more rapidly than the longer, slower EMA we’re comparing it to. This is a sign of stronger upward momentum in the price action. The opposite holds true when the MACD becomes more negative. There’s another way to generate signals from the MACD, however. This method is based on using a signal line in addition to the MACD as we calculated it. ### The MACD with a Signal Line The signal line is a EMA of the MACD signal we calculated. Frequently this will be a 9-day EMA to go along with a 12 and 26-day EMA like we calculated above. Writing the psuedocode, we just need to add one more step: 1. Calculate the Signal Line (SL) as the EMA of the MACD: • SL[t] = (MACD[t] - EMA_MACD[t-1]) * 2 / (N_SL + 1) + EMA_MACD[t-1] Again, we can write this mathematically as: SL_t = \big( MACD_t - EMA_{t-1}^{MACD} \big) \frac{2}{N^{SL}+1} + EMA_{t-1}^{MACD} Let’s modify our calcMACD function to allow for signal line calculation too. def calcMACD(data: pd.DataFrame, N_fast: int, N_slow: int, signal_line: bool = True, N_sl: int = 9): assert N_fast < N_slow, "Fast EMA must be less than slow EMA parameter." 
# Add short term EMA data = calcEMA(data, N_fast) # Add long term EMA data = calcEMA(data, N_slow) # Subtract values to get MACD data['MACD'] = data[f'EMA_{N_fast}'] - data[f'EMA_{N_slow}'] if signal_line: data = calcEMA(data, N_sl, key='MACD') # Rename columns data.rename( columns={f'SMA_{N_sl}': f'SMA_MACD_{N_sl}', f'EMA_{N_sl}': f'SignalLine_{N_sl}'}, inplace=True) return data  Now, we will run this new function and plot it to show how the signal line looks versus the MACD. N_fast = 12 N_slow = 26 N_sl = 9 signal_line = True data = calcMACD(df, N_fast, N_slow, signal_line, N_sl) # Plot MACD and Signal Line fig, ax = plt.subplots(figsize=(15, 8)) ax.plot(data['MACD'], label='MACD') ax.plot(data[f'SignalLine_{N_sl}'], label='Signal Line') ax.set_ylabel('MACD') ax.set_xlabel('Date') ax.set_title(f'MACD and Signal Line for {ticker}') ax.legend() plt.show()  In the plot above, we can see that the signal line intersects with the MACD on a number of occasions. As you have probably guessed, these intersections provide buy and sell signals as well. You can buy when the MACD crosses above the signal line and sell/short when it goes below. 
## MACD Momentum

Oftentimes, you’ll see MACD charts with bars as well as lines on the graph, like this:

data['MACD_bars'] = data['MACD'] - data[f'SignalLine_{N_sl}']

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
fig, ax = plt.subplots(1, figsize=(15, 8), sharex=True)

# Re-index dates to avoid gaps over weekends
_x_axis = np.arange(data.shape[0])
month = pd.Series(data.index.map(lambda x: x.month))
x = month - month.shift(1)
# Positions where a new month starts, used as tick locations
_x_idx = np.where((x.values!=0) & (~np.isnan(x.values)))[0]

ax.plot(_x_axis, data['MACD'], label='MACD')
ax.plot(_x_axis, data[f'SignalLine_{N_sl}'], label='Signal Line',
    c=colors[2])
ax.bar(_x_axis, data['MACD_bars'], label='MACD Bars',
    color=colors[4], width=1, edgecolor='w')

ax.set_xticks(_x_idx)
ax.set_xticklabels(data.index[_x_idx].map(
    lambda x: datetime.strftime(x, '%Y-%m-%d')),
    fontsize=10)
ax.legend()
plt.show()


The MACD bars, also called the MACD histogram, plot the difference between the MACD and the signal line. Chartists (those who read charts to make trades) can use these bars to gauge the strength of the momentum and make buy/sell decisions. If the bars are growing in size, then momentum is increasing as the MACD pulls away from the signal line. Conversely, if the bars begin shrinking, we could be seeing a reversal coming.

We’re focused on automatic and systematic trading, so we don’t want to spend time looking at bars to see if they’re growing or shrinking. Rather, we want the computer to do it for us!

Thankfully, this is only a few small steps away.

### Getting Momentum from the MACD Histogram

First, we need to decide how many consecutive days of growth we require before entering a position. For our illustration here, we’ll use three days.
From there, we can get the daily changes by using the diff() method and then use the np.sign() function to determine whether each change is positive or negative (note that this function returns 0 when the value is unchanged). We’ll call this column Growth.

After that, we use Pandas’ handy rolling() method to sum our Growth column. This gives us 3 if the bars rose on three consecutive days, -3 if they fell on three straight days, and some value between -2 and 2 otherwise, making it easy for us to pick out our signals. For lack of a better name, let’s call this column Consecutive_Bars.

Finally, we’ll get our position by looking for all of those +/-3 values. We want to go long when the bars have risen for three straight days and short when they have fallen for three straight days, so we take the sign of our Consecutive_Bars column when its absolute value equals the number of consecutive days, and 0 otherwise.

The code for all of this is given below.

N_consecutive = 3
data['Growth'] = np.sign(data['MACD_bars'].diff(1))
data['Consecutive_Bars'] = data['Growth'].rolling(N_consecutive).sum()
data['Position'] = data['Consecutive_Bars'].map(
    lambda x: np.sign(x) * 1 if np.abs(x) == N_consecutive else 0)

long_idx = data['Position']==1
short_idx = data['Position']==-1

fig, ax = plt.subplots(2, figsize=(15, 8), sharex=True)
ax[0].plot(_x_axis, data['Close'], label=f'{ticker}', linestyle=':')
ax[0].scatter(_x_axis[long_idx], data.loc[long_idx]['Close'],
    label='Longs', c=colors[3])
ax[0].scatter(_x_axis[short_idx], data.loc[short_idx]['Close'],
    label='Shorts', c=colors[1])
ax[0].set_ylabel('Price ($)')
ax[0].set_title(f'{ticker} and Long/Short Positions based on MACD Bars')
ax[0].legend()

ax[1].bar(_x_axis, data['MACD_bars'], label='MACD Bars',
color=colors[4], width=1, edgecolor='w')
ax[1].scatter(_x_axis[long_idx], data.loc[long_idx]['MACD_bars'],
label='Longs', c=colors[3])
ax[1].scatter(_x_axis[short_idx], data.loc[short_idx]['MACD_bars'],
label='Shorts', c=colors[1])
ax[1].plot(_x_axis, data['MACD'], label='MACD')
ax[1].plot(_x_axis, data[f'SignalLine_{N_sl}'], label='Signal Line',
color=colors[2])
ax[1].set_ylabel('MACD')
ax[1].set_title('MACD, Signal Line, MACD Bars, and Long/Short Positions')

ax[1].set_xticks(_x_idx)
ax[1].set_xticklabels(data.index[_x_idx].map(
lambda x: datetime.strftime(x, '%Y-%m-%d')),
fontsize=10)

ax[1].legend()
plt.show()


There’s a lot going on in the plots above!

In the first panel, we simply have the price action for our stock and the long/short positions overlaid so you can get a feel for what the stock may be doing when this indicator chooses to go long or short. It starts off catching the end of an uptrend and getting out near the peak, then waits a bit before entering to go short while the stock flat-lines.

As you can see from the rest of the chart, this indicator, like any other, makes a few mistakes. It also lags the price action by quite a bit: not only are you working off of three different EMAs, but you then wait for three consecutive days before putting a position on. This lets it catch some strong trends, but because we are so quick to exit (a single down day in the MACD bars is enough to close a long position), we miss a good chunk of the moves even when we are positioned correctly. So, it looks like this could be useful, but we may need a better exit strategy to make the most of it.
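To see how quickly this rule exits, it helps to run the same three steps on a handful of made-up bar values. This is a minimal sketch that uses hypothetical numbers rather than the article's data; the steps mirror the Growth, Consecutive_Bars, and Position columns.

```python
import numpy as np
import pandas as pd

N_consecutive = 3

# Hypothetical MACD bar values: four rising days, one dip, then flat
bars = pd.Series([0.0, 0.1, 0.2, 0.3, 0.4, 0.35, 0.35, 0.35])

# +1 for an up day, -1 for a down day, 0 if unchanged, NaN on the first row
growth = np.sign(bars.diff(1))

# Sum of the last N_consecutive growth values (NaN until the window is full)
consecutive = growth.rolling(N_consecutive).sum()

# Keep the sign only when every day in the window moved the same way
position = consecutive.map(
    lambda x: np.sign(x) if np.abs(x) == N_consecutive else 0)

print(pd.DataFrame({'MACD_bars': bars, 'Growth': growth,
                    'Consecutive_Bars': consecutive, 'Position': position}))
```

The long position appears only once three straight up days are in the window, and it drops back to 0 on the very first down day, which is the quick exit described above.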

In the second panel, you see the MACD, signal line, MACD bars, and our longs/shorts. The MACD is derived from the price action and mimics it to some extent. The signal line smooths out some of the MACD moves and looks like a wave that is slightly out of phase with it. You can see our signals on the bars themselves too.

## Some concluding thoughts

The MACD is a very popular indicator because of its flexibility and diversity. Here, we discussed just a handful of ways we can interpret the values to develop trading systems, but there are others out there we’ll address in the future.

While developing systems with the MACD, keep in mind that the indicator itself is unbounded, meaning it has no ceiling or floor. This is unlike other oscillating signals like RSI, which have a maximum and a minimum value. Moreover, the value is dependent on price. While our example ranged from a little over 1 to -1, you could have more expensive securities that produce MACD values in the teens or hundreds. This is important because it becomes very difficult – if not impossible – to compare MACD values directly across different assets.
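The price dependence is easy to check numerically. The sketch below builds two hypothetical securities with identical returns but prices that differ by a factor of 100, using `ewm(span=N, adjust=False)` as a stand-in for the article's `calcEMA`. The final step, dividing by the slow EMA, is only one possible way to remove the price scale and is an assumption of this example, not something prescribed above.

```python
import numpy as np
import pandas as pd


def macd(prices: pd.Series, n_fast: int = 12, n_slow: int = 26) -> pd.Series:
    # Stand-in for calcMACD built on pandas' recursive EMA
    fast = prices.ewm(span=n_fast, adjust=False).mean()
    slow = prices.ewm(span=n_slow, adjust=False).mean()
    return fast - slow


# Hypothetical random-walk prices with a fixed seed for repeatability
rng = np.random.default_rng(0)
cheap = pd.Series(10 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))))
expensive = 100 * cheap  # same returns, 100x the price

macd_cheap = macd(cheap)
macd_expensive = macd(expensive)
print(f"Max |MACD|, cheap:     {macd_cheap.abs().max():.4f}")
print(f"Max |MACD|, expensive: {macd_expensive.abs().max():.4f}")

# One possible normalization (an assumption): divide by the slow EMA
rel_cheap = macd_cheap / cheap.ewm(span=26, adjust=False).mean()
rel_expensive = macd_expensive / expensive.ewm(span=26, adjust=False).mean()
print(f"Largest gap after normalizing: {(rel_cheap - rel_expensive).abs().max():.2e}")
```

Since every EMA is a weighted sum of prices, multiplying the price by 100 multiplies the MACD by 100 as well, while the ratio to the slow EMA is the same for both series.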

The 12-26-9 MACD with a signal line is a very popular setup, but it isn’t the only one. You’re free to experiment, but note that the farther apart the short-term and long-term EMA periods are, the larger the MACD values you’re going to get. This isn’t necessarily a problem, but it is something to keep an eye on; you may need to adjust some other parameters in your system to compensate.
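A steady trend makes the effect of the EMA spacing concrete. For a price that rises by a constant amount s each day, a recursive EMA with smoothing 2 / (N + 1) settles at a lag of s(N - 1) / 2 behind the price, so the MACD settles at s(N_slow - N_fast) / 2. The sketch below checks this on a hypothetical straight-line price, again assuming `ewm(span=N, adjust=False)` as a stand-in for `calcEMA`.

```python
import numpy as np
import pandas as pd


def macd(prices: pd.Series, n_fast: int, n_slow: int) -> pd.Series:
    # Stand-in for calcMACD built on pandas' recursive EMA
    fast = prices.ewm(span=n_fast, adjust=False).mean()
    slow = prices.ewm(span=n_slow, adjust=False).mean()
    return fast - slow


# Hypothetical price that rises by exactly $1 per day
prices = pd.Series(100.0 + np.arange(1000))

for n_fast, n_slow in [(12, 26), (5, 50)]:
    last = macd(prices, n_fast, n_slow).iloc[-1]
    expected = (n_slow - n_fast) / 2
    print(f"{n_fast}-{n_slow}: MACD = {last:.2f}, (N_slow - N_fast) / 2 = {expected}")
```

On the same trend, the 5-50 pair settles at a MACD more than three times that of the 12-26 pair, so any fixed thresholds tuned for one setup would need to be revisited for the other.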

There’s a lot to take in with this indicator and a lot of moving parts to keep track of. At Raposa, we make this easy for you. You can pick your stocks, set your parameters, and let us handle all of the details for you. We’ll provide you the stats on your backtests and give live signals when you’re ready to trade it in the real world.