How To Improve Your Trading System With The Kelly Criterion

How much should you invest in a given trade?

Should you put in 1-2% of your trading capital or YOLO your way to the moon by putting everything you've got into it (and maybe some leverage for good measure)?

It all comes down to the odds of course. And thankfully, there's a precise formula for calculating your position size called the Kelly Criterion.

The Kelly Criterion gives an optimal result for betting based on the probability of winning a bet and how much you receive for winning. If you check out Wikipedia or Investopedia, you’ll see formulas like this:

$$f^* = p - \frac{1-p}{b-1}$$

which gives you the optimal amount to bet (f*) given the probability of winning (p) and the payout you’re given for the bet (b). For example, if you have a bet that gives you a 52% chance of winning and you have 1:1 odds (bet 1 to win 1 and your money back, so you get a total payout of 2), then you should bet 4% of your capital on each game (0.52–0.48/(2–1) = 0.04).

This is fine for a binary, win-lose outcome. The trouble is, investing in stocks don’t follow this simple model. If you make a winning trade, you could get a 10% return, 8% return, 6.23%, 214%, or any other value.
So how do we change our binary formula to a continuous model?

Continuous Kelly

Ed Thorpe and Claude Shannon (yes, the Claude Shannon for us nerds out there) used the Kelly Criterion to manage their black jack bankroll and clean up in Vegas in the 60's. Seeing the applicability of this method, Thorpe extended it to Wall Street using it to manage his investments while running his own hedge fund. Developing a continuous Kelly model isn’t actually very straightforward, but Thorpe offers this series of steps which you can find in Section 7 here.

Most people probably don’t care about the derivation, so we’ll just jump to this new model which winds up being deceptively simple:

$$f^* = \frac{\mu- r}{\sigma^2}$$

Here, μ are the mean returns, r is the risk-free rate, and σ is the standard deviation of returns. All of this looks very much like the Sharpe Ratio, but instead of dividing by the standard deviation, we divide by the variance.

Now that we have our new formula, let’s put an example strategy or two together to see how it works in theory.

Long-Only Portfolio with the Kelly Criterion

We’re going to start simple to put this into practice with a long-only portfolio that will rebalance based on the Kelly Criterion. When trading with Kelly position sizing, there are a few things to keep in mind.

First, our Kelly factor (f*) can go over 1, which implies use of leverage. This may or may not be feasible depending on the trader and leverage may incur additional borrowing costs. Clearly, it increases your risk and volatility as well so not everyone will want to trade with leverage.

Second, we need to estimate the mean and standard deviation of our security. This can be affected by our look-back period. Do we use one year? One month? Or some other time period? One year seems to be standard, so we’ll start with that. Also, the time period you choose may be related to the speed of your trading strategy. If you’re trading minute-by-minute, perhaps a one-month look-back period makes more sense.

Finally, we need to determine how frequently we’ll rebalance our portfolio according to our updated Kelly factor. This can create a lot of transaction costs if it occurs too frequently. There could also be tax implications associated with selling positions quickly, which are beyond the scope of this model. For our purposes, we’ll rebalance every day so that we’re always holding the estimated, optimal position.

Ok, with that out of the way, let’s get some Python packages.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf

Calculating the Kelly Criterion for Stocks

Our first function is going to be straightforward. All we’re doing is plugging the mean, standard deviation, and our interest rate into that formula above and returning f-star.

def calcKelly(mean, std, r):
  return (mean - r) / std**2

Our example is going to be a simple, vectorized backtest to quickly give you the flavor of using the Kelly formula for position sizing. We’ll write one more helper function that will take our returns as a Pandas Series and return the f-star for each time step.

def getKellyFactor(returns: pd.Series, r=0.01, 
  max_leverage=None, periods=252, rolling=True):
  '''
  Calculates the Kelly Factor for each time step based
  on the parameters provided.
  '''
  if rolling:
    std = returns.rolling(periods).std()
    mean = returns.rolling(periods).mean()
  else:
    std = returns.expanding(periods).std()
    mean = returns.expanding(periods).mean()

  r_daily = np.log((1 + r) ** (1 / 252))
  kelly_factor = calcKelly(mean, std, r_daily)
  # No shorts
  kelly_factor = np.where(kelly_factor<0, 0, kelly_factor)
  if max_leverage is not None:
    kelly_factor = np.where(kelly_factor>max_leverage,
      max_leverage, kelly_factor)
    
  return kelly_factor

Let’s use some real data to demonstrate this. I chose the SPY, which is the S&P 500 ETF which was introduced back in 1993. We can get it’s history through 2020 from yfinance with the code below:

ticker = 'SPY'
yfObj = yf.Ticker(ticker)
data = yfObj.history(start='1993-01-01', end='2020-12-31')
# Drop unused columns
data.drop(['Open', 'High', 'Low', 'Volume', 'Dividends', 
  'Stock Splits'], axis=1, inplace=True)

Now we’re ready to put this into action in a strategy. A key assumption stated by Thorpe in the Kelly formula above is the lack of short selling. With that in mind, we’re going to start with the simplest strategy I can think of, a buy-and-hold strategy that will rebalance the portfolio between cash and equities according the Kelly factor.

For this, we’re going to assign a percentage of our total portfolio to the SPY based on the Kelly factor. If leverage increases above 1, then we will have a negative cash value to reflect our borrowing. I didn’t get very granular, so I just used the same interest rate from 1993–2020 and assumed you can borrow and lend at that same rate — which is really a ridiculous assumption because nobody is going to let you, dear retail investor, pay 1% to lever up your stock portfolio (commodities, FOREX, crypto, and other markets typically have more leverage available). You can get daily 10-year rates or whatever treasury instrument suits you as a baseline from FRED, the US Treasury site, or your favorite data provider to use real data in your Kelly calculation.

Let’s get to the function.

def LongOnlyKellyStrategy(data, r=0.02, max_leverage=None, periods=252, 
  rolling=True):
  data['returns'] = data['Close'] / data['Close'].shift(1)
  data['log_returns'] = np.log(data['returns'])
  data['kelly_factor'] = getKellyFactor(data['log_returns'], 
    r, max_leverage, periods, rolling)
  cash = np.zeros(data.shape[0])
  equity = np.zeros(data.shape[0])
  portfolio = cash.copy()
  portfolio[0] = 1
  cash[0] = 1
  for i, _row in enumerate(data.iterrows()):
    row = _row[1]
    if np.isnan(row['kelly_factor']):
      portfolio[i] += portfolio[i-1]
      cash[i] += cash[i-1]
      continue

    portfolio[i] += cash[i-1] * (1 + r)**(1/252) + equity[i-1] * row['returns']
    equity[i] += portfolio[i] * row['kelly_factor']
    cash[i] += portfolio[i] * (1 - row['kelly_factor'])

  data['cash'] = cash
  data['equity'] = equity
  data['portfolio'] = portfolio
  data['strat_returns'] = data['portfolio'] / data['portfolio'].shift(1)
  data['strat_log_returns'] = np.log(data['strat_returns'])
  data['strat_cum_returns'] = data['strat_log_returns'].cumsum()
  data['cum_returns'] = data['log_returns'].cumsum()

  return data

We’re going to run a max-leverage strategy and compare it to a baseline, buy-and-hold.

kelly = LongOnlyKellyStrategy(data.copy())

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)

ax[0].plot(np.exp(kelly['cum_returns']) * 100, label='Buy and Hold')
ax[0].plot(np.exp(kelly['strat_cum_returns']) * 100, label='Kelly Model')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Buy-and-hold and Long-Only Strategy with Kelly Sizing')
ax[0].legend()

ax[1].plot(kelly['kelly_factor'])
ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()

Our model blew up spectacularly! Without any constraints on our leverage, it quickly shot up to 50x leverage — and did great for a bit — but then hit some major losses and plummeted below the baseline. It wound up losing our initial investment fairly quickly.

Important lesson here, while the Kelly Criterion may give you the optimal allocation to trade with, it is only as good as the assumptions that underpin it. Note that this is not a predictive tool, we’re looking back in time and assuming the previous year’s volatility is a good guide going forward.

We can do a few different things to de-risk this while gaining the benefits of a better money management strategy. First, we can cap our leverage to something more reasonable, say 3 or 4x (although I’d even keep it lower myself). We also want to see some risk metrics associated with this.

First, we'll define getStratStats() as a helper function to calculate Sharpe Ratio, drawdowns, and the like, then we'll loop through our model with various leverage caps in place.

def getStratStats(log_returns: pd.Series, risk_free_rate: float = 0.02):
  stats = {}  # Total Returns
  stats['tot_returns'] = np.exp(log_returns.sum()) - 1  
  
  # Mean Annual Returns
  stats['annual_returns'] = np.exp(log_returns.mean() * 252) - 1  
  
  # Annual Volatility
  stats['annual_volatility'] = log_returns.std() * np.sqrt(252)  
  
  # Sortino Ratio
  annualized_downside = log_returns.loc[log_returns<0].std() * \
    np.sqrt(252)
  stats['sortino_ratio'] = (stats['annual_returns'] - \
    risk_free_rate) / annualized_downside  
  
  # Sharpe Ratio
  stats['sharpe_ratio'] = (stats['annual_returns'] - \
    risk_free_rate) / stats['annual_volatility']  
  
  # Max Drawdown
  cum_returns = log_returns.cumsum() - 1
  peak = cum_returns.cummax()
  drawdown = peak - cum_returns
  max_idx = drawdown.argmax()
  stats['max_drawdown'] = 1 - np.exp(cum_returns[max_idx]) / np.exp(peak[max_idx])
  
  # Max Drawdown Duration
  strat_dd = drawdown[drawdown==0]
  strat_dd_diff = strat_dd.index[1:] - strat_dd.index[:-1]
  strat_dd_days = strat_dd_diff.map(lambda x: x.days).values
  strat_dd_days = np.hstack([strat_dd_days,
    (drawdown.index[-1] - strat_dd.index[-1]).days])
  stats['max_drawdown_duration'] = strat_dd_days.max()

  return stats

Putting it all together:

max_leverage = np.arange(1, 6)

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)
data_dict = {}
df_stats = pd.DataFrame()

for l in max_leverage:
  kelly = LongOnlyKellyStrategy(data.copy(), max_leverage=l)
  data_dict[l] = kelly.copy()
  
  ax[0].plot(np.exp(kelly['strat_cum_returns']) * 100,
             label=f'Max Leverage = {l}')
  ax[1].plot(kelly['kelly_factor'], label=f'Max Leverage = {l}')
  stats = getStratStats(kelly['strat_log_returns'])
  df_stats = pd.concat([df_stats, 
    pd.DataFrame(stats, index=[f'Leverage={l}'])])

ax[0].plot(np.exp(kelly['cum_returns']) * 100, label='Buy and Hold', 
           linestyle=':')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Buy-and-hold and Long-Only Strategy with Kelly Sizing')
ax[0].legend()

ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()

# View statistics
stats = pd.DataFrame(getStratStats(kelly['log_returns']), index=['Buy and Hold'])
df_stats = pd.concat([stats, df_stats])
df_stats

There’s a lot to unpack here. For clarity, a leverage ratio of 1, means that we aren’t actually using any leverage, we’re maxing out by putting all of our capital into the SPY. This more conservative model, winds up with lower total returns than the buy and hold approach, but it has a shorter drawdown with less volatility because it will scale back after volatility increases. Keep in mind that the buy and hold allocates all of our capital to the S&P 500, then never touches that for nearly three decades. So the primary difference between these two approaches is that the Kelly Criterion model with leverage capped at 1 will allocate some capital to cash after volatile periods.

Allowing leverage to max out at 2x increases the total returns over buy-and-hold, but with more volatility and lower risk-adjusted returns. It’s a bit tough to see these models in the plot above, so we’ll zoom in on the these results below.

fig, ax = plt.subplots(2, figsize=(15, 8), sharex=True)

ax[0].plot(np.exp(data_dict[1]['strat_cum_returns'])*100, 
         label='Max Leverage=1')
ax[0].plot(np.exp(data_dict[2]['strat_cum_returns'])*100, 
         label='Max Leverage=2')
ax[0].plot(np.exp(data_dict[1]['cum_returns'])*100, 
         label='Buy and Hold', linestyle=':')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Low-Leverage and Baseline Models')
ax[0].legend()

ax[1].plot(data_dict[1]['kelly_factor'], label='Max Leverage = 1')
ax[1].plot(data_dict[2]['kelly_factor'], label='Max Leverage = 2')
ax[1].set_xlabel('Date')
ax[1].set_ylabel('Leverage')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()

Here, we can more clearly see the periods where the Kelly Criterion gets out of the market and goes to cash as the volatility increases during large drawdowns. The worst of the crashes in 2000 and 2008 are avoided. The COVID crash in 2020, however, was much more rapid and wound up leading to the big losses — particularly the more leveraged the strategy was — as the strategy stayed in the market during the worst of it and got crushed.

Unsurprisingly, the more highly leveraged models all had more volatility with much larger drawdowns. They gave us higher total returns, but the risk metrics all show they did so with a lot more risk.

I had mentioned a few other ways to control leverage apart from tight caps. Let's take a look at some of these and see if that helps our returns.

Larger Sample Size

We can increase the sample size by increasing the number of periods we use for calculating the mean and standard deviation of our returns. This will lead to a slower reacting model, but may lead to better estimates and thus better results.

I'm going to still use a cap here, but I'll put it at 3x - which is roughly what most brokers are going to limit retail traders to when trading equities and ETFs.

max_leverage = 3

periods = 252 * np.arange(1, 5)

fig, ax = plt.subplots(2, figsize=(15, 10), sharex=True)
data_dict = {}
df_stats = pd.DataFrame()
for p in periods:
  p = int(p)
  kelly = LongOnlyKellyStrategy(data.copy(), periods=p,
      max_leverage=max_leverage)
  data_dict[p] = kelly.copy()
  ax[0].plot(np.exp(kelly['strat_cum_returns']) * 100,
             label=f'Days = {p}')
  ax[1].plot(kelly['kelly_factor'], label=f'Days = {p}', linewidth=0.5)
  stats = getStratStats(kelly['strat_log_returns'])
  df_stats = pd.concat([df_stats, 
    pd.DataFrame(stats, index=[f'Days={p}'])])

ax[0].plot(np.exp(kelly['cum_returns']) * 100, label='Buy and Hold', 
           linestyle=':')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(
    'Buy-and-hold and Long-Only Strategy with Kelly Sizing ' +
    'and Variable Lookback Periods')
ax[0].legend()

ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()

stats = pd.DataFrame(getStratStats(kelly['log_returns']), index=['Buy and Hold'])
df_stats = pd.concat([stats, df_stats])
df_stats

These longer lookback periods tend to increase the time out of market, which extends the drawdown durations in many cases. But they wind up with higher overall returns, at least up to the 4-year lookback period, One thing that makes these comparisons somewhat unfair is that they require more time and data before they enter the market in the first place. So the SPY gets to compounding immediately, the 1-year model gets going after 252 trading days have passed, and so forth. Regardless, we still see good gains from these longer term systems.

One more example of this before moving on. I included an argument called rolling in our function that defaults to True. It's almost always better to include more data in our estimates, so if we set this to False, we switch out a rolling horizon for an expanding horizon calculation. This latter approach will be identical to the rolling horizon model at day 252 under the standard settings, but then will diverge because it doesn't drop old data. So the mean and standard deviation of the SPY will contain the last 500 data points at day 500. This quickly increases our sample size as the model moves forward in time. In addition, the number of periods we use now serves as a minimum number of data points required before we get a result for our Kelly factor.

kelly = LongOnlyKellyStrategy(data.copy(), rolling=False)

fig, ax = plt.subplots(2, figsize=(12, 8), sharex=True)

ax[0].plot(np.exp(kelly['cum_returns']) * 100, label='Buy and Hold')
ax[0].plot(np.exp(kelly['strat_cum_returns']) * 100, label='Kelly Model')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title(
    'Buy-and-hold and Long-Only Strategy with Kelly Sizing ' +
    'and Expanding Horizon')
ax[0].legend()

ax[1].plot(kelly['kelly_factor'])
ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Expanding Kelly Factor')

plt.tight_layout()
plt.show()

In this case, we see that the Kelly factor largely decreases over time as the sample grows. It does begin to increase after 2008, however, it performed very poorly after the 2000 crash — which it was highly levered heading into — and was never able to recover.

Excessive leverage got this model into hot water as well and blew up the account. Feel free to try the expanding approach with a leverage cap in place to see how it works.

Personally, I prefer the rolling horizon - even if it does mean we drop data - because it is more adaptive to market regimes. The market goes through periods of high and low volatility, so including all of this data in the volatility estimates mutes recent performance. This can be seen by comparing the Kelly factor in the last plot in 2020 with the other plots. Here, it barely drops as volatility skyrockets, whereas in the other, more responsive models, the Kelly factor drops so severely that most go completely into cash.

Applying the Kelly Criterion to a Trading Strategy

So far we have just looked at applying the Kelly Criterion to a single asset to manage our cash-equity balance. What if we want to use it with a trading strategy on a single asset? Could we improve our risk adjusted returns in this scenario.
The answer is “yes,” but we do need to be careful in how we apply it. If we run the risk of blowing up thanks to leverage while trading the S&P 500, this can be even more of a risk with a given trading strategy.

We’re going to keep things as simple as possible here and run a moving average cross-over strategy. Again, we’ll keep all the assumptions about liquidity, leverage, and no short selling that we laid out above.

There are a couple of changes we’ll need to make to run this. First, we will need to figure out our position, which is just going to be when the short-term SMA is above the long-term SMA.

Second, instead of using the mean and standard deviation of the underlying asset, we’re going to rely on the stats from our strategy. This will use the stats from using our strategy without position sizing.

The code is given below and is similar to what you saw in our long-only strategy.

# Kelly money management for trading strategy
def KellySMACrossOver(data, SMA1=50, SMA2=200, r=0.01, 
  periods=252, max_leverage=None, rolling=True):
  '''
  Sizes a simple moving average cross-over strategy according
  to the Kelly Criterion.
  '''
  data['returns'] = data['Close'] / data['Close'].shift(1)
  data['log_returns'] = np.log(data['returns'])
  # Calculate positions
  data['SMA1'] = data['Close'].rolling(SMA1).mean()
  data['SMA2'] = data['Close'].rolling(SMA2).mean()
  data['position'] = np.nan
  data['position'] = np.where(data['SMA1']>data['SMA2'], 1, 0)
  data['position'] = data['position'].ffill().fillna(0)
  data['_strat_returns'] = data['position'].shift(1) * \
    data['returns']
  data['_strat_log_returns'] = data['position'].shift(1) * \
    data['log_returns']
  # Calculate Kelly Factor using the strategy's returns
  kf = getKellyFactor(data['_strat_log_returns'], r, 
    max_leverage, periods, rolling)
  data['kelly_factor'] = kf
  
  cash = np.zeros(data.shape[0])
  equity = np.zeros(data.shape[0])
  portfolio = cash.copy()
  portfolio[0] = 1
  cash[0] = 1
  for i, _row in enumerate(data.iterrows()):
    row = _row[1]
    if np.isnan(kf[i]):
      portfolio[i] += portfolio[i-1]
      cash[i] += cash[i-1]
      continue
    
    portfolio[i] += cash[i-1] * (1 + r)**(1/252) + equity[i-1] * row['returns']
    equity[i] += portfolio[i] * row['kelly_factor']
    cash[i] += portfolio[i] * (1 - row['kelly_factor'])

  data['cash'] = cash
  data['equity'] = equity
  data['portfolio'] = portfolio
  data['strat_returns'] = data['portfolio'] / data['portfolio'].shift(1)
  data['strat_log_returns'] = np.log(data['strat_returns'])
  data['strat_cum_returns'] = data['strat_log_returns'].cumsum()
  data['cum_returns'] = data['log_returns'].cumsum()
  return data

Let’s see how the model performs vs the baseline with moderate leverage.

kelly_sma = KellySMACrossOver(data.copy(), max_leverage=3)

fig, ax = plt.subplots(2, figsize=(15, 8), sharex=True)

ax[0].plot(np.exp(kelly_sma['cum_returns']) * 100, label='Buy-and-Hold')
ax[0].plot(np.exp(kelly_sma['strat_cum_returns'])* 100, label='SMA-Kelly')
ax[0].plot(np.exp(kelly_sma['_strat_log_returns'].cumsum()) * 100, label='SMA')
ax[0].set_ylabel('Returns (%)')
ax[0].set_title('Moving Average Cross-Over Strategy with Kelly Sizing')
ax[0].legend()

ax[1].plot(kelly_sma['kelly_factor'])
ax[1].set_ylabel('Leverage')
ax[1].set_xlabel('Date')
ax[1].set_title('Kelly Factor')

plt.tight_layout()
plt.show()

sma_stats = pd.DataFrame(getStratStats(kelly_sma['log_returns']), 
                         index=['Buy and Hold'])
sma_stats = pd.concat([sma_stats,
            pd.DataFrame(getStratStats(kelly_sma['strat_log_returns']),
              index=['Kelly SMA Model'])])
sma_stats = pd.concat([sma_stats,
            pd.DataFrame(getStratStats(kelly_sma['_strat_log_returns']),
              index=['SMA Model'])])
sma_stats

The Kelly SMA Model doubles the buy and hold approach in terms of total returns. Like the others, it was in a leveraged long position heading into the COVID crash and got crushed. Moreover, from a risk-adjusted return basis, it performs worse than either a basic SMA model or the buy and hold strategy.

Trading with the Kelly Criterion

Leverage can be a powerful and dangerous tool by amplifying both gains and losses. Each of the strategies we ran without a cap on the leverage ratio blew up at some point, highlighting the dangers of such an approach. Most traders who do use the Kelly Criterion in their position sizing only trade half or quarter Kelly, i.e. with 50% or 25% of the Kelly factor size. This is to control risk and avoid blowing up, which is a fate much worse than underperforming the market for a few years.

It’s important to note that the Kelly Criterion is not predictive.

While it may optimize the long-run growth of your returns, we see that it falls down time and time again when unconstrained because it is looking backwards over a small sample size. To get the most out of it, we’d need to use it in a constrained setting on a diversified strategy over many markets. That way, we’d be developing our stats based on the performance of the strategy, giving us a larger sample size and better estimate while also limiting our leverage. We’ll look at strategies like this in future posts.

Interested in more? We're building an algorithmic trading platform that you can use to test your strategies and ideas. You can try our free demo here!