A demon comes to you one night giving you a simple dice game to play. You're offered the chance to wager your wealth and receive a 50% increase if you roll a 6, 5% bump if you roll 2-5, or lose 50% if you roll a 1. You also get 300 rolls and get to compound your wealth with each roll of the die.

Do you play?

Most people would jump at this opportunity. They'd look at the average of these payoffs and think they have a 3.3% edge. If they did the math on that, they'd find that the expected value compounds to the easiest 1,700,000% (1.033^300) return on their wealth they'll ever see!
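To make that arithmetic concrete, here is a minimal sketch of the calculation. The variable names are illustrative only, and the payoffs are written as gross multipliers (0.5 for a 50% loss, 1.5 for a 50% gain).

```python
# Gross payoff multipliers for die faces 1 through 6, as described above.
payoffs = [0.5, 1.05, 1.05, 1.05, 1.05, 1.5]

# Arithmetic average multiplier per roll (each face has probability 1/6).
mean_multiplier = sum(payoffs) / len(payoffs)
edge_pct = (mean_multiplier - 1) * 100

# Compounding the rounded 3.3% edge over 300 rolls, as in the text.
compounded = 1.033 ** 300

print(f"Arithmetic edge per roll: {edge_pct:.1f}%")
print(f"1.033^300 is roughly {compounded:,.0f}x starting wealth")
```

A multiple of roughly 17,000x is where the 1,700,000% figure comes from.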

This is a demon though, so what's the catch?

In fact, talk to any experienced gambler, card player, or trader, and they'd tell you that even with that edge, betting the house on each roll of the die is a fool's errand.

## TL;DR

This is the most important chapter in Mark Spitznagel's book, Safe Haven: Investing for Financial Storms. It's not because he shares the "secret sauce" to his tail hedging fund (spoiler alert: he doesn't), but because he gives you the tools to approach risk mitigation through the example of a dice game and three ways we can play it. We walk through each of the examples with code, plus add a few different wrinkles not found in the book, to help clarify how Spitznagel considers risk mitigation in his portfolio. If you don't want to code, but just want to learn how to trade, you can check out the free demo of our no-code algorithmic trading platform here.

Just like the average family with 2.3 kids doesn't exist, your average return doesn't either. In fact, you only get this 3.3% return in a special case that Mark Spitznagel terms playing with Schrödinger's Demon.

In physicist Erwin Schrödinger's famous thought experiment, a cat sits in a quantum superposition that renders it both alive and dead at the same time. In the dice game, looking at the arithmetic average is like assuming every roll gives you the payoff from all the outcomes simultaneously. So yes, the average has a great payoff, but only in this multiverse world where you get all of the results at once does it actually mean anything to the individual dice player.

In reality, you get to play with what Spitznagel terms Nietzsche's Demon - you get one chance at life and this dice game, so you better make it count!

(I'm not going to go into why he uses Nietzsche, Schrödinger, and demons - we'll be plagiarizing enough of the book in this blog post - so go ahead and get the book here for the humorous and entertaining vignettes that motivate the math; it's worth the read!)

This requires a different mathematical technique to see if you should play.

## Simulating the Roll of a Die

Because I'm slow (or skeptical) I didn't really "get" the power of Spitznagel's point until I recreated the examples myself. The examples aren't complicated - they're deceptively simple - which makes them great tools to learn from. Moreover, we can run our own Monte Carlo simulation in Python in just a few lines of code! Of course, you don't have to run the code yourself (although doing so will allow you to ask new questions), because we provide detailed explanations all along the way.

Enough blabber.

Let's import some basic packages and write a function to simulate our game with Nietzsche's Demon of the dice.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(1234)

A quick side note - it's always recommended to set your random seed when doing any kind of stochastic simulation like this so that the results are reproducible. I don't always follow my own advice, but I did remember to insert that simple line above so that you can run these exact experiments for yourself.
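If you'd rather not touch NumPy's global random state, a seeded generator object gives the same reproducibility. This is a minimal sketch using `np.random.default_rng`, which is not what the post's code uses; `draw_rolls` is a hypothetical helper written just for this illustration.

```python
import numpy as np

def draw_rolls(seed: int, n: int = 5) -> np.ndarray:
    """Draw n die faces (1-6) from a generator seeded with `seed`."""
    rng = np.random.default_rng(seed)  # local generator, no global state
    return rng.integers(1, 7, size=n)  # upper bound is exclusive

first = draw_rolls(1234)
second = draw_rolls(1234)
other = draw_rolls(4321)

print(first, second, other)
print("Same seed, same rolls:", np.array_equal(first, second))
```

The same seed always reproduces the same sequence, which is all we need to make a simulation repeatable.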

With that, we've got a 6-sided die with the -50%, 50%, and 5% payoffs we stated above. We'll run 10,000 simulations of 300 rolls each. That's 3 million simulated rolls, which can easily be handled using NumPy and some vectorized functions for some clean, concise code.

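def NietszcheDice(cash: float=0,
                  returns: list=[0.5, 1.05, 1.05, 1.05, 1.05, 1.5],
                  rolls: int=300, samples: int=10000):
    # Fraction of wealth wagered on each roll; the rest sits in cash
    bet = 1 - cash
    # Wealth multiplier for each die face after blending cash and the bet
    adj_returns = cash + bet * np.asarray(returns)
    # Sample a face per roll (np.random.choice assumed; the call was missing)
    roll_sims = np.random.choice(adj_returns,
        size=(rolls, samples)).reshape(-1, rolls)
    # Compound the multipliers along each trajectory: shape (samples, rolls)
    return roll_sims.cumprod(axis=1)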

You can pass a different list of returns, increase or decrease the number of sampled simulations or the number of rolls. There's also a bit of foreshadowing going on here if you look at the code because we have another argument called cash; we'll explain this shortly, but for now it's set to 0 and doesn't affect anything.

We can get our 10,000 simulated trajectories of our wealth with:

n_traj = NietszcheDice()

Each trajectory corresponds to a simulated series of 300 rolls. The last column in our trajectory array will be our ending wealth fraction, which we can analyze to see how we fared.
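As a small illustration of what one of those rows holds, here is a sketch of compounding with a cumulative product. The three rolls are hypothetical, chosen by hand instead of sampled.

```python
import numpy as np

# Three hypothetical rolls: a 6 (+50%), a 1 (-50%), then a 3 (+5%).
multipliers = np.array([1.5, 0.5, 1.05])

# Running product = wealth after each roll, starting from 1.0.
wealth_path = np.cumprod(multipliers)

print(wealth_path)       # wealth fraction after each roll
print(wealth_path[-1])   # ending wealth fraction
```

Note that a 50% gain followed by a 50% loss leaves you at 75% of where you started, not back at even.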

Now, let's write one more helper function to analyze all of this data. We want to see what the relevant quantiles of our trajectories are, so we'll get a quick function to handle that and look at our data.

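def getQuantilePath(trajectories: np.ndarray, q: float=0.5):
    # Value of the q-th quantile of ending wealth
    quantile = np.quantile(trajectories[:, -1], q=q)
    # Trajectory whose ending wealth is closest to that quantile
    path = trajectories[np.abs(quantile - trajectories[:, -1]).argmin()]
    return quantile, path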

This function takes in our trajectory array and finds the trajectory that most closely corresponds to our ending wealth quantile. So, if we're looking for the median (or the 50th percentile) we pass q=0.5 and we'll get our median value and the particular path it took for easy visualization.
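To see the lookup on something small enough to check by eye, here is a sketch on a toy array of five hand-written trajectories. `quantile_path` is a stand-in that mirrors the logic described above, and the numbers are made up for the illustration.

```python
import numpy as np

def quantile_path(trajectories: np.ndarray, q: float = 0.5):
    """Quantile of the last column plus the trajectory that ends nearest to it."""
    final = trajectories[:, -1]
    quantile = np.quantile(final, q=q)
    nearest = np.abs(final - quantile).argmin()
    return quantile, trajectories[nearest]

# Five toy trajectories of three steps each (rows = trajectories).
toy = np.array([
    [1.0, 0.5, 0.2],
    [1.0, 0.9, 0.8],
    [1.0, 1.1, 1.0],
    [1.0, 1.5, 2.0],
    [1.0, 2.0, 4.0],
])

median_value, median_path = quantile_path(toy, q=0.5)
print(median_value, median_path)
```

The median ending value is the middle of the five final values, and the returned path is the row that ends there.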

Now, let's plot the results of our simulation and look at the results (I'll be using this exact same code throughout the post, so won't repeat it for each plot).

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

perc50, path50 = getQuantilePath(n_traj)
perc95, path95 = getQuantilePath(n_traj, q=0.95)
perc5, path5 = getQuantilePath(n_traj, q=0.05)
path_avg = n_traj.mean(axis=0)

fig = plt.figure(figsize=(15, 8))
gs = fig.add_gridspec(1, 2, width_ratios=(3, 1))
# Left panel: wealth paths; right panel: growth-rate histogram
ax = fig.add_subplot(gs[0])
ax_hist = fig.add_subplot(gs[1])

# Color indices below are assumed; the original indices were lost
ax.plot(path50, label='Median')
ax.plot(path95, label=r'$95^{th}$ Percentile')
ax.plot(path5, label=r'$5^{th}$ Percentile')
ax.plot(path_avg, label='Mean', linestyle=':')
ax.fill_between(np.arange(n_traj.shape[1]),
    y1=n_traj.min(axis=0),
    y2=n_traj.max(axis=0),
    alpha=0.3, color=colors[4])
ax.set_xlabel('Rolls')
ax.set_ylabel('Ending Wealth')
ax.semilogy()
ax.legend(loc=3)

# Per-roll compound growth rate (%) implied by each ending wealth
growth = (np.power(n_traj[:, -1], 1/300) - 1) * 100
growth_med = (np.power(path50[-1], 1/300) -1) * 100
growth_avg = (np.power(path_avg[-1], 1/300) - 1) * 100
ax_hist.hist(growth, orientation='horizontal', bins=50,
    color=colors[4], alpha=0.3)
ax_hist.axhline(0, label='Break Even', color='k', linestyle=':')
ax_hist.axhline(growth_med, label='Median', color=colors[0])
ax_hist.axhline(growth_avg, label='Mean', color=colors[3])
ax_hist.set_ylabel('Compound Growth Rate (%)')
ax_hist.set_xlabel('Frequency')
ax_hist.legend()

plt.tight_layout()
plt.show()

The game with the 3.3% edge doesn't look so good if you imagine playing it 10,000 times! The average case (the dotted, orange line) increases steadily to give you that great payoff, but, if you look to the histogram on the right, you'll see that very few people are so lucky as to be "average." In fact, only 0.9% of outcomes are average or better.

This is because the returns are heavily skewed, so your average is much higher than the median - where you're much more likely to end up. In fact, looking at the median, we see we end up with a 99% loss.

That's right, betting everything with a fat, enviable 3.3% edge is going to drive us to end up with 1% of our wealth in most cases.
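If you want to check those shares without drawing any plots, a sketch along these lines should do it. It uses its own seeded generator and variable names, so the exact figures will differ a little from the ones quoted above, and "average" here means the sample mean of ending wealth.

```python
import numpy as np

rng = np.random.default_rng(1234)  # seed chosen arbitrarily for this sketch
payoffs = np.array([0.5, 1.05, 1.05, 1.05, 1.05, 1.5])
rolls, samples = 300, 10_000

# One row per simulated player, one column per roll.
sims = rng.choice(payoffs, size=(samples, rolls))
final_wealth = sims.prod(axis=1)

median_wealth = np.median(final_wealth)
share_losing = (final_wealth < 1).mean()
share_above_sample_mean = (final_wealth >= final_wealth.mean()).mean()

print(f"Median ending wealth: {median_wealth:.4f}")
print(f"Share of players losing money: {share_losing:.1%}")
print(f"Share at or above the sample mean: {share_above_sample_mean:.1%}")
```

The sample mean is dominated by a handful of extremely lucky players, which is why so few trajectories reach it.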

Of course, if you read our previous post, you'd know that you could just take a shortcut and calculate all of this with the geometric average to get the median returns and our compound growth rate.

# Gross payoff for each die face; in 300 rolls each face is expected 50 times
returns = [0.5, 1.05, 1.05, 1.05, 1.05, 1.5]
print(f"Median Outcome: {(np.prod([i**50 for i in returns]) - 1)*100:.2f}%")
x = (np.power(np.prod([i**50 for i in returns]), 1/300) - 1) * 100
print(f"Compound Growth Rate of Median: {x:.2f}%")

Median Outcome: -99.02%
Compound Growth Rate of Median: -1.53%

What if we could change the rules just a smidge? How about instead of betting all of our wealth on each roll of the dice, we bet 40% of our wealth and held 60% back in cash? What would that do?

## Positioning to Win

Let's use that extra argument in our NietzscheDice() function to keep 60% of our wealth in cash and re-run the experiment above.

# With cash in reserve
cash = 0.6
cash_traj = NietszcheDice(cash)

This 60/40 portfolio of sorts greatly increases our returns!

This may seem counterintuitive: we've reduced our arithmetic average from 3.3% down to 1.3%, yet we've boosted our geometric mean up to 0.64%. Over 300 rolls, this translates into a median outcome of a 582% increase in your overall wealth.
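Those three numbers can be recovered directly from the per-face multipliers. This is a minimal sketch with illustrative variable names, assuming each face is equally likely.

```python
import numpy as np

payoffs = np.array([0.5, 1.05, 1.05, 1.05, 1.05, 1.5])
cash = 0.6

# Per-face wealth multipliers with 60% held in cash and 40% wagered.
blended = cash + (1 - cash) * payoffs

arith_pct = (blended.mean() - 1) * 100
geo_mean = np.prod(blended) ** (1 / len(blended))  # equal probabilities
geo_pct = (geo_mean - 1) * 100
median_gain_pct = (geo_mean ** 300 - 1) * 100

print(f"Arithmetic mean: {arith_pct:.1f}%")
print(f"Geometric mean: {geo_pct:.2f}%")
print(f"Median gain over 300 rolls: {median_gain_pct:.0f}%")
```

The worst face now costs 20% instead of 50%, which is what lifts the geometric mean above zero.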

Nothing was done here apart from reducing our bet size. This should highlight the critical importance of risk management and position sizing when trading - take on too much risk and you're likely to blow up your account like we saw in the first example.

We reduced the mean, increased the median, but also increased our odds of making money. In the first example, 79% of our trajectories wound up losing money and only 1% were above average. In this case, only 18% lost money and 16% were above the average.

We have also reduced our variance. Our 5th and 95th percentile cases are much closer to one another in this latter case. Of course, the extremely high wealth that we had in the all-in case (achieved by the extremely lucky) has been reduced, but this strikes me as a worthwhile trade-off.

### The Optimal Bet Size

So we see that reducing our bet size to 40% boosts our returns. But is this the best we can do?

To test this, we can run our NietzscheDice Monte Carlo simulation for any bet size we want and compare the results. We'll generate the data and see what maximizes our compound growth rate.

The code to run this is given below. Note that I'm running each fraction 10 times and averaging the median values out just to smooth the curves a bit. If you have a slow computer, this may take a few minutes to run.

# Optimal tradeoff
cash_frac = np.linspace(0, 1, 101)[::-1]
N = 10 # Multiple runs to smooth out the values
vals5 = np.zeros((len(cash_frac), N))
vals50 = vals5.copy()
vals95 = vals5.copy()
for i in range(N):
    for j, f in enumerate(cash_frac):
        traj = NietszcheDice(f)
        perc5, _ = getQuantilePath(traj, 0.05)
        perc50, _ = getQuantilePath(traj, 0.5)
        perc95, _ = getQuantilePath(traj, 0.95)
        vals5[j, i] += perc5
        vals50[j, i] += perc50
        vals95[j, i] += perc95

vals5_smooth = vals5.mean(axis=1)
vals50_smooth = vals50.mean(axis=1)
vals95_smooth = vals95.mean(axis=1)

# Plot the results
plt.figure(figsize=(12, 8))
plt.plot(vals5_smooth, label=r'$5^{th}$ Percentile')
plt.plot(vals50_smooth, label=r'$50^{th}$ Percentile')
plt.plot(vals95_smooth, label=r'$95^{th}$ Percentile')
plt.scatter(vals5_smooth.argmax(), vals5_smooth.max(),
    marker='*', s=200)
plt.scatter(vals50_smooth.argmax(), vals50_smooth.max(),
    marker='*', s=200)
plt.scatter(vals95_smooth.argmax(), vals95_smooth.max(),
    marker='*', s=200)
plt.xlabel('Percentage of Wealth Wagered')
plt.ylabel('Ending Wealth')
plt.title('Optimal Bet Size')
plt.semilogy()
plt.legend()
plt.show()

It turns out that the 40% wager we chose is roughly optimal (Spitznagel set it up this way), as shown by the star on the 50th percentile in the plot above. Note that this is the bet that maximizes the 50th percentile, but, if we were more risk averse, we could try to optimize for the 5th percentile and go into 94% cash while only betting 6% of our wealth (that reduces our expectation down to 5.8% total growth in our wealth from the 582% increase we saw with maximizing the median, but you get the idea).

Note too that if you increase much beyond that 40% level, you start to get a reduction in your median wealth, and the drop off gets steeper the more off base you are.

Maximizing the median like this can be calculated using the Kelly Criterion, which provides a precise formula for calculating our position size. We're not going to go into the derivation or anything here, but for the curious, we can calculate the Kelly Criterion for multiple, discrete outcomes as:

$$g^* = \max_x \prod_{i=1}^N (1 + w_i x)^{p_i}$$

This means we're trying to choose x - which is our bet size - in order to maximize the product on the right, and g* is the best growth factor we can reach. In this case, w_i represents the winnings from each of our outcomes (e.g. -50%, 5%, and 50%) and p_i is the probability of each outcome (we've got dice, so 1/6 for each).
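For readers who want to see where the optimum comes from, here is a sketch of the first-order condition. It writes g(x) for the product before maximizing (a label introduced here, not taken from the book) and assumes every 1 + w_i x is positive so the logarithms are defined. Taking logs turns the product into a sum:

$$\ln g(x) = \sum_{i=1}^N p_i \ln(1 + w_i x), \qquad \frac{d \ln g}{dx} = \sum_{i=1}^N \frac{p_i w_i}{1 + w_i x} = 0$$

Plugging in the dice payoffs, with one losing face, four faces paying 5%, and one face paying 50%, gives:

$$\frac{1}{6}\left(\frac{-0.5}{1 - 0.5x} + \frac{4 \cdot 0.05}{1 + 0.05x} + \frac{0.5}{1 + 0.5x}\right) = 0$$

The left-hand side is positive at x = 0.37 and negative at x = 0.38, so the optimal bet sits just under 38% of our wealth.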

The easiest way to find the bet size that maximizes our returns is by trying every fraction to find out where g* is highest.

If we do that, we can get the following plot:

def discreteKellyCriterion(x: float, returns: list, probs: list):
    # Growth rate per roll when betting fraction x of our wealth
    return np.prod([(1 + b * x)**p for b, p in zip(returns, probs)]) - 1

probs = np.repeat(1/6, 6)
# Net winnings per face (not gross multipliers)
returns = [-0.5, 0.05, 0.05, 0.05, 0.05, 0.5]
# cash_frac is reused here as the grid of bet fractions from 0 to 1
g = np.array([discreteKellyCriterion(f, returns, probs)
    for f in cash_frac])
g *= 100

plt.figure(figsize=(12, 8))
plt.plot(cash_frac, g)
plt.xlabel('Fraction Bet')
plt.ylabel('Compound Growth Rate (%)')
plt.title('Optimal Bet Size According to the Kelly Criterion')
plt.show()

The maximum growth rate in this theoretical case is very close to the one from our simulated results.

# Simulated and theoretical are very close
g_sim = (np.power(vals50_smooth, 1/300) - 1) * 100
print(f"Simulated Max Growth Rate: {g_sim.max():.2f}")
print(f"Theoretical Max Growth Rate: {g.max():.2f}")

Simulated Max Growth Rate: 0.65
Theoretical Max Growth Rate: 0.64

And the theoretical value comes out to a 62/38 cash/bet split.
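To double-check that split on a finer grid than the 1% steps used for the plot, here is a minimal sketch that maximizes the expected log growth directly. The names and the grid resolution are choices made for this illustration.

```python
import numpy as np

net_returns = np.array([-0.5, 0.05, 0.05, 0.05, 0.05, 0.5])
probs = np.full(6, 1 / 6)

# Candidate bet fractions on a grid finer than the 1% steps used above.
bets = np.linspace(0, 1, 10_001)

# Expected log growth per roll for every candidate bet (rows = bets).
log_growth = (probs * np.log1p(np.outer(bets, net_returns))).sum(axis=1)

best_bet = bets[log_growth.argmax()]
best_growth_pct = (np.exp(log_growth.max()) - 1) * 100

print(f"Best bet fraction: {best_bet:.4f}")
print(f"Growth rate at the optimum: {best_growth_pct:.2f}%")
```

Working in logs avoids taking sixth roots of products and gives the same maximizer, since the logarithm is monotonic.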

Spitznagel doesn't stop with the power of the Kelly Criterion though; he continues by offering a tantalizing example of the possibilities if you are able to insure yourself against a loss.

## Playing Dice with Insurance

In our previous post, we showed how paying for insurance can increase the returns for our 17th century merchant who ran the risk of pirates. The same idea can be applied to our dice game as well as your portfolio.

Spitznagel asks us to play the same dice game, except we can insure against our losses of rolling that 1. If we get a 1, we get a 500% return on our allocation, otherwise we lose our premium every time. If we allocate 9% of our capital to insurance on every roll and wager 91% of our capital, we see our arithmetic average drop to 3% vs the 3.3% edge we saw with the original game.

# Playing Dice with Insurance
f = 0.09
insurance = np.array([6, 0, 0, 0, 0, 0])
returns = np.array([0.5, 1.05, 1.05, 1.05, 1.05, 1.5])

ins_rets = f * insurance + (1 - f) * returns
print(f'Mean Returns with Insurance {(ins_rets.mean() - 1) * 100:.1f}%')
Mean Returns with Insurance 3.0%

You probably see where this is going though, so I'll cut to the chase: our geometric mean (the one that does all of that valuable compounding) rises to 2.1% with this trade-off.

ins_gm = (np.power(np.prod(np.power(ins_rets, 50)), 1/300) - 1) * 100
print(f'Geometric Mean with Insurance {ins_gm:.1f}%')
Geometric Mean with Insurance 2.1%

To simulate this, we'll have to modify our NietzscheDice() function slightly to accommodate both return profiles.

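def NietszcheDiceIns(ins_frac: float=0,
                     dice_returns: list=[0.5, 1.05, 1.05, 1.05, 1.05, 1.5],
                     ins_returns: list=[6, 0, 0, 0, 0, 0],
                     rolls: int=300, samples: int=10000):
    # Fraction of wealth left to wager after paying the insurance premium
    bet = 1 - ins_frac
    # Blend the insurance payoff and the dice payoff for each face
    adj_returns = (ins_frac * np.asarray(ins_returns)
                   + bet * np.asarray(dice_returns))
    # Sample a face per roll (np.random.choice assumed; the call was missing)
    roll_sims = np.random.choice(adj_returns,
        size=(rolls, samples)).reshape(-1, rolls)
    return roll_sims.cumprod(axis=1)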

Now, we can simulate our 10,000 trajectories and see how this model performs.

# With insurance
ins_frac = 0.09
ins_traj = NietszcheDiceIns(ins_frac)

Adding this dice insurance does remarkable things for our payoff profile!

We've eliminated those 50% losses entirely at the cost of reducing our winning payoffs by 9%. It certainly seems like a worthy trade-off because we've drastically cut down on our probability of losing money (0.2% finished in the red) and boosted our compound growth rate. What is most striking is that we went from losing money on 1/6 rolls to losing money on 5/6 rolls with the insurance, yet improved our payoff.
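A quick sketch of the per-face multipliers makes that shift visible. The variable names are illustrative, and the 9% allocation and 6x gross payout are the ones used in this section.

```python
import numpy as np

dice = np.array([0.5, 1.05, 1.05, 1.05, 1.05, 1.5])
insurance = np.array([6, 0, 0, 0, 0, 0])  # pays 6x the premium on a 1
ins_frac = 0.09

# Wealth multiplier for each face with 9% in insurance and 91% wagered.
blended = ins_frac * insurance + (1 - ins_frac) * dice

losing_faces_before = int((dice < 1).sum())
losing_faces_after = int((blended < 1).sum())
worst_before, worst_after = dice.min(), blended.min()

print(blended)
print(f"Losing faces: {losing_faces_before} before, {losing_faces_after} after")
print(f"Worst multiplier: {worst_before:.4f} before, {worst_after:.4f} after")
```

More faces lose money with insurance, but the worst single roll becomes a loss of a few percent instead of half our wealth.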

This is precisely the kind of asset Spitznagel is referring to when he writes about safe havens - we have mitigated our risk in a cost-effective manner because we raised our CAGR by adding this to our portfolio.

As we did for the others, let's look at the optimality curve for the insurance investment. Buying too much insurance (moving farther left on the plot) is disastrous for your portfolio. The payoff is too small and too rare to make up for it and it quickly plummets into a massive abyss you're not going to get yourself out of.

Let's zoom in on those optimal sizes. What's nice about the insurance case is that you don't need much of it to provide big returns. As Spitznagel puts it, insurance is "like a pinch of salt - just a pinch becomes the most important ingredient to the dish, whereas more than a pinch ruins it."

You can see that impact in the plot.

The optimal point is much narrower here, and thus unforgiving, so don't overdo it!

Not only is the amount important, but what you pay is important too.

With the 500% return, the amount you get is exactly what you're expected to pay into the insurance via your premium. For the insurance company, the arithmetic average is what's important - they can spread their bets across multiple realizations (like you could with Schrödinger's Demon). Due to competition, you'd expect their edge to be low, a fraction of a percent, but this example sets it at 0 - essentially, it's perfectly priced and you're not likely to get insurance that cheap to support your gambling.
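The "perfectly priced" claim is just an expected-value calculation from the seller's side. Here is a minimal sketch; the 400% and 300% payouts are included for comparison, and the names are illustrative.

```python
# Gross payout per unit of premium when a 1 is rolled (6 means a 500% return).
payouts = [6, 5, 4]
p_one = 1 / 6  # probability of rolling a 1

edges = {}
for payout in payouts:
    expected_per_unit = p_one * payout  # buyer's expected gross return
    edges[payout] = (1 - expected_per_unit) * 100  # insurer's edge in percent
    print(f"{(payout - 1) * 100}% return: buyer expects "
          f"{expected_per_unit:.3f} per unit of premium, "
          f"insurer edge {edges[payout]:.1f}%")
```

At a 500% return the insurer's arithmetic edge is zero, and each step down in payout hands the insurer roughly another 17% of the premium.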

So, what do these payoffs look like if we reduce our insurance payout and how much should we pay for insurance before we fall off of our compounding wealth expectations cliff?

We can investigate this by computing our returns for each fraction of our portfolio that we allocate to insurance at different returns and computing the geometric mean. We'll calculate all of these, then plot it and take a look at the results.

# What to pay for insurance?
plt.figure(figsize=(12, 8))
fracs = np.linspace(0.8, 1, 21)
for i in reversed(range(3, 7)):
    growth = []
    # Insurance pays i times the premium on a 1 and nothing otherwise
    ins_rets = np.array([i, 0, 0, 0, 0, 0])
    for f in fracs:
        # f is the fraction wagered; the remainder buys insurance
        rets = (1 - f) * ins_rets + f * np.asarray(returns)
        g = (np.power(np.prod(np.power(rets, 50)), 1/300) - 1) * 100
        growth.append(g)

    plt.plot(fracs * 100, growth,
        label=f'{(i-1)*100}% Insurance Return')
    plt.scatter(fracs[np.argmax(growth)] * 100,
        max(growth), marker='*', s=200)

plt.ylabel('Growth Rate (%)')
plt.xlabel('Percentage Wagered')
plt.title('Returns for Different Insurance Payoffs')
plt.legend()
plt.show()

The best we could hope to do (unless there was systematic mispricing of insurance in the market - which has happened before) is that perfect price where we get a 500% return (the upper red line). We'd expect to be somewhere below that.

If we only get a 400% return every time we hit a 1, then our compound growth rate drops from 2.1% to 0.55% and we need to reduce our allocation to insurance from 9% to 7%.

If insurance gets much more expensive than that, we move into negative expectations, but at least in the case of a 300% payout, we do better with some (5%) insurance allocation than none. Below that, it's better to go without insurance altogether than to pay for it.
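The same comparison can be read off as numbers. This sketch grid-searches the insurance allocation in 1% steps for each payout; `best_insurance` is a hypothetical helper, and the 0-20% search range is an assumption chosen to cover the region shown in the plot.

```python
import numpy as np

dice = np.array([0.5, 1.05, 1.05, 1.05, 1.05, 1.5])

def best_insurance(payout: float):
    """Grid-search insurance allocations of 0-20% in 1% steps for one payout."""
    ins = np.array([payout, 0, 0, 0, 0, 0])
    best_pct, best_growth = 0, -np.inf
    for pct in range(0, 21):
        f = pct / 100
        rets = f * ins + (1 - f) * dice
        growth = (np.prod(rets) ** (1 / 6) - 1) * 100  # geometric mean per roll
        if growth > best_growth:
            best_pct, best_growth = pct, growth
    return best_pct, best_growth

results = {payout: best_insurance(payout) for payout in (6, 5, 4, 3)}
for payout, (pct, growth) in results.items():
    print(f"{(payout - 1) * 100}% return: "
          f"best allocation {pct}%, growth {growth:.2f}%")
```

With a 200% return (a gross payout of 3), the search settles on no insurance at all, which matches the cutoff described above.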

## Cash, Insurance, and your Optimal Wager

I got curious, so I decided to go one step further and look at the trade-off between our three asset classes and see what gives us the best results in all of these cases.

To do this, I calculated the geometric mean for different combinations of cash and insurance to see if this portfolio would boost returns at all.

Here's the (ugly) code to do this:

# Cash and Insurance
fig, ax = plt.subplots(1, 3, figsize=(20, 8), sharey=True)
for n, j in enumerate(reversed(range(4, 7))):
    ins_rets = np.array([j, 0, 0, 0, 0, 0])
    cash_frac = np.linspace(0, 1, 101)
    ins_frac = np.linspace(0, 0.2, 21)
    growth = np.zeros((len(ins_frac), len(cash_frac)))
    for i, f in enumerate(ins_frac):
        _growth = np.zeros(len(cash_frac))
        for k, c in enumerate(cash_frac):
            # Skip allocations that add up to more than 100%
            if f + c > 1:
                continue
            rets = c + f * ins_rets + (1 - c - f) * np.asarray(returns)
            g = (np.power(np.prod(np.power(rets, 50)), 1/300) - 1) * 100
            _growth[k] += g

        growth[i] += _growth

    # Indices of the best growth rate: m[0] -> insurance, m[1] -> cash
    m = np.where(growth==growth.max())
    best_ins, best_cash = ins_frac[m[0][0]], cash_frac[m[1][0]]

    X, Y = np.meshgrid(ins_frac, cash_frac)
    cont = ax[n].contour(X * 100, Y * 100, growth.T, cmap=plt.cm.plasma,
        levels=[-1, 0, 0.25, 0.5, 1, 2, 2.5])
    ax[n].set_xlabel('Insurance Allocation (%)')
    ax[n].set_title(f'Insurance Payoff = {(ins_rets[0] - 1) * 100}%')
    # Some ugly code to make sure that the labels don't overlap
    _x_loc = 15
    if j == 6:
        _x_loc = 6
    ax[n].annotate(r'$g^* = {:.2f}$%'.format(growth.max()),
        xy=(_x_loc, 20), size=10)
    ax[n].annotate(f'Cash = {best_cash * 100:.0f}%',
        xy=(_x_loc, 15), size=10)
    ax[n].annotate(f'Insur. = {best_ins * 100:.0f}%',
        xy=(_x_loc, 10), size=10)

ax[0].set_ylabel('Cash Allocation (%)')
cbar = fig.colorbar(cont, ax=ax.ravel().tolist(),
    shrink=0.95, drawedges=True)
fig.suptitle('Expected Growth Rate for Cash and Insurance Allocation',
    size=16)
plt.show()

With cheap insurance, the best bet is just to play the optimal insurance game. If it gets a bit more expensive, then we go to 40% cash plus a pinch of insurance (3%). If the insurance gets too expensive, we drop it altogether and just take the Kelly bet we found above.

## We've got Options!

It's common to come across discussions that say "tail risk hedging doesn't pay." From this simple example, though, you can see the problem with the simplistic models behind those conclusions: they frequently devote a naive, fixed percentage of the portfolio to option strategies, cash, treasuries, or other safe haven strategies.

For example, in this post, we see that their insurance strategy included a variety of out of the money puts (table below). That's fine, but from the previous examples, we saw that the price you pay for the insurance is crucial, and there's no indication that was taken into consideration. Likely, they had multiple periods where they were overpaying for insurance by simply allocating 10% (the precise amount isn't clear) to out of the money puts. A more robust strategy would be dynamic and shift to cash or treasuries as option prices increase and back to options as they decrease.

In investing, it's never as simple as "tail hedging doesn't work," or as any other blanket statement that gets thrown out there makes it seem.

Our favorite to push back on is "retail traders always lose money."

Yes, a lot do - we don't dispute that - but most aren't reading about proper risk management and systematic investing either!

We believe that with good principles in place, you can make money trading, and that the best way to do it is through a well-tested, algorithmic approach that takes the emotions out of trading decisions.

Traditionally, this has only been open to those with the money to pay hedge funds to manage their money or to those with lots of math and coding skill (and copious amounts of free time) to develop and test strategies themselves. We're building a no-code trading platform to allow average investors to research quantitative strategies and deploy them to trade in the markets on their behalf.

We don't think algorithmic and quantitative trading should be the domain of the few.

If you're interested, check out our free demo and join us as we democratize quantitative investing!