Monte Carlo Simulation of Stock Returns using Python

Monte Carlo Simulation of Stock Returns using Python

While it’s impossible to predict the future, investors all over the world are nevertheless constantly working on new and ever more sophisticated models to make predictions about the future movements of stocks or other assets. Today, we’ll take a look at one of the best-known models used to make predictions about what a given stock price could be in the future – the Monte Carlo simulation (MCS).

What is a Monte carlo simulation?

Yes, you’ve guessed it, the name “Monte Carlo” does indeed refer to the (in)famous capital of Europe’s take on Las Vegas, Monaco, where around the clock, thousands of people try their luck in one of the many casinos of the city, engaging in random trials (e.g. throwing a dice or drawing from a deck of cards). Just like in these casinos, the Monte Carlo simulation runs a given number of random trials of a predetermined experiment in order to give us an idea of the behaviour of our experiment if performed over and over again.

How does this relate to stock prices? In most of the main-stream financial literature, stock returns are assumed to be normally distributed*. Therefore, if we keep drawing random numbers from a correctly specified normal distribution and treat each of these numbers as the return of the stock for a given period, we can aggregate all trials to arrive at a prediction for the future stock price. However, if you did this only once, you would quickly realise that your prediction turns out to be pretty horrible in most cases. That shouldn’t be a surprise. Just like throwing a fair dice 3 times might quite well give you three sixes in a row, if you threw the dice 1000 times, you’ll see that all 6 outcomes are uniformly distributed. Thanks, Law of Large Numbers! Hence, we’ll repeat the process of drawing and aggregating numbers many times over and eventually we’ll get a pretty decent picture of the range, in which the stock price should end up.

Set up

Basically, our code will be structured into three parts.

First, we need to get our data. I’ll be using the pandas_datareader package with the DataReader function in order to get the stock data from Yahoo Finance. There are many other ways to do it, but since this is only tangential to the simulation, I won’t put much focus on it. Second, and probably most fun, we’ll implement the MCS. In the end, we’ll put the numbers into some nice graphs to visually consider the results of our simulation.

Let’s get to it.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
import pandas_datareader

First of all, we’ll write a function to get the stock price from Yahoo Finance. The function receives as input the ticker (a unique identifier for the stock, which can be found on Yahoo Finance. For example Apple has the ticker “AAPL”), a start date and an end date.

def get_data_yahoo(ticker, start, end):
    data_source = "yahoo"
    df = pandas_datareader.DataReader(ticker, data_source, start=start, end=end)["Adj Close"] = ticker
    return df

Here comes the fun bit… The next function will be where the dataframe with all our simulation data is created. It receives the data frame with the price data from the function get_data_yahoo() as well as the ticker name, the number of simulations and the number of predicted days. Note here that we need a sufficiently large num_simulations whereas a long time frame, i.e. large predicted_days, makes the model less accurate.

Now follows a fairly large chunk of code. Take a look first, I’ll explain afterwards.

def monte_carlo_sim(df, num_simulations, predicted_days):
    returns = df.pct_change()
    prices = df
    last_price = prices[-1]
    simulation_df = pd.DataFrame()
    daily_std = returns.std()
    daily_mean = returns.mean()
    for i in range(num_simulations):
        count = 0
        price_series = []
        price = last_price * (1 + np.random.normal(daily_mean, daily_std))

        for j in range(predicted_days):
            if count == predicted_days:
            price = price_series[count] * (1 + np.random.normal(daily_mean, daily_std))
            count += 1
        simulation_df[i] = price_series

    return simulation_df

First of all we create the series containing all the daily stock returns of the stock price and a last_price, which is the most recent stock price. 

We then enter two for loops, the first of which takes care of each individual simulation. For each simulation we require a new price_series list, which we will iteratively fill with our prediction for all subsequent time periods.

Take a closer look at the variable price. The value we assign to it is the most recent stock price (last_price) multiplied with a random number drawn from the normal distribution with the parameters that fit the historic price movements of our stock. This new stock price for period t+1 is appended to our price_series.

Enter the second loop (which could be easily substituted by a for loop) which will loop through prices until our set number of predicted_days is reached. And now, we basically do the same as in the first loop: Take the most recent price prediction (in the 1st iteration this is the prediction for t+1 from the first loop) and multiply it with the random number from the same normal distribution as before.  Append it to the price list. By the time we hit our desired number of days, price_series will include a price predictions for each day. When we leave the second loop, price_series will be appended as a new row to the simulation_df.

We then jump to the next iteration of the loop and repeat the whole process until we have the required number of simulations and with that the complete simulation_df, where each row is a simulation and each column represents a day with all its simulated prices.

Wasn’t that hard, was it?

Now, we have everything we need for plotting the simulation.

def plot_graph(df, ticker, num_simulations):
    df = df[df.columns[df.iloc[-1, :].argsort()]]
    qs = [math.ceil(num_simulations * 0.95), math.ceil(num_simulations * 0.05), math.ceil(num_simulations * 0.5)]
    plt.plot(df, linewidth=0.3)
    labels = [95, 5, 50]
    for q, l in zip(qs, labels):
        v = df.iloc[:, q]
        plt.plot(v, linewidth=3, label="{} % - Quantil: {}".format(l, np.round(v.iloc[-1], 2)))
    plt.legend(loc="upper left")
    plt.title("Monte Carlo Simulation of {} price".format(ticker))

To the plotting function we pass the simulation data frame, the ticker for labelling and the number of simulations. When plotting the simulations, it would be great if we could label the 5% and 95% quantiles. This is common practice in finance since this way we could see what the lowest stock price is that we’ll only cut with a probability of 5%. Google “Value at Risk” if you want to learn more about that. To be able to do that, we will need to order the simulation prices by the last predicted price. 

Finally, we define the list qs which contains the indices of the simulations yielding prices at the 5% and 95% quantile as well as at the median.

The rest of the function is a pretty straightforward matplotlib plot. 
Here’s the full code where I simply added some structure so we can run it. I also plotted the simulation for the FAANG-companies (Facebook, Apple, Amazon, Netflix and Google) for demonstration.

That’s it. Fun, wasn’t it? I will make sure to cover more stock price simulations in the future. See you next time!

Leopold Pfeiffer

Leave a Reply

Your email address will not be published. Required fields are marked *