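In this tutorial we will generate Bollinger band data, run an event study on it, turn the events into orders, and then backtest and analyze the resulting portfolio.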
Note
To learn more about what a Bollinger band is, please see this article.
You can get the full source code of the tutorial here.
The tutorial is based on the last homework in QSTK. Since the portfolio is analyzed from the start date, the returned metrics will be different even if you use the same stock universe as the homework.
First you need to initialize the object and set up the stock universe:
prophet = Prophet()
prophet.set_universe(["AAPL", "XOM"])
Then you register any data generators.
# Registering data generators
prophet.register_data_generators(YahooCloseData(),
                                 BollingerData(),
                                 BollingerEventStudy())
Note
Please see the source code of prophet.data for an example of a data generator. Data generators don’t have to pull raw data the way prophet.data.YahooCloseData does. For instance, you can generate correlation data from the price data. Prophet encourages you to logically separate the different steps in your analysis.
The name attribute of each generator is the key under which its output is stored on the data object. This data object is passed into each of the data generators. For example, since the YahooCloseData object has the name “prices”, we can use the price data in BollingerData, which runs right after it.
import pandas as pd

from prophet.data import DataGenerator


class BollingerData(DataGenerator):
    name = "bollinger"

    def run(self, data, symbols, lookback, **kwargs):
        prices = data['prices'].copy()

        # Rolling statistics over the lookback window
        rolling_std = prices.rolling(window=lookback).std()
        rolling_mean = prices.rolling(window=lookback).mean()
        bollinger_values = (prices - rolling_mean) / (rolling_std)

        for s_key in symbols:
            prices[s_key] = prices[s_key].ffill()
            prices[s_key] = prices[s_key].bfill()
            prices[s_key] = prices[s_key].fillna(1.0)

        return bollinger_values
Note how the BollingerData.run() method uses the price data to generate a rolling standard deviation and rolling mean. The fillna calls are there to fill in missing data. Realistically, only the backward fill (bfill) is used in this example, because the first 20 days won’t have 20 prior days of price data to generate the rolling mean and standard deviation.
Note
prices is also passed into the run function of all DataGenerator objects for convenience but we want to emphasize that the data object is where most data from data generators is stored.
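To make the role of the data object concrete, here is a minimal sketch in plain Python. It is not Prophet's implementation: PriceGenerator, ChangeGenerator and run_generators are hypothetical names used only for illustration, and the real run() methods take more arguments. It only shows the idea that each generator's output is stored under its name and is visible to the generators registered after it.

```python
class PriceGenerator:
    """Hypothetical stand-in for a generator that loads prices."""
    name = "prices"

    def run(self, data):
        return {"AAPL": [10.0, 11.0, 12.0]}


class ChangeGenerator:
    """Hypothetical generator that reads what was stored under "prices"."""
    name = "changes"

    def run(self, data):
        prices = data["prices"]
        return {
            symbol: [b - a for a, b in zip(series, series[1:])]
            for symbol, series in prices.items()
        }


def run_generators(generators):
    """Run generators in registration order, storing each result by name."""
    data = {}
    for generator in generators:
        data[generator.name] = generator.run(data)
    return data


data = run_generators([PriceGenerator(), ChangeGenerator()])
print(data["changes"])  # {'AAPL': [1.0, 1.0]}
```

Registration order matters in this sketch: swapping the two generators would raise a KeyError, because "prices" would not be on the data object yet.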
The line below normalizes the Bollinger data relative to the rolling standard deviation. This gives us the distance of each price from its rolling mean, measured in standard deviations (a floating point value). This means a value of 2 would be the upper band and a value of -2 would be the lower band.
bollinger_values = (prices - rolling_mean) / (rolling_std)
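As a worked example of the same formula, the sketch below computes it for a single list of prices using only the standard library. The function name rolling_bollinger and the sample prices are made up for illustration. It assumes the sample standard deviation, which is what pandas rolling std() uses by default.

```python
from statistics import mean, stdev


def rolling_bollinger(prices, lookback):
    """Return (price - rolling mean) / rolling std for each day.

    Days with fewer than `lookback` prices available get None.
    """
    values = []
    for i in range(len(prices)):
        if i + 1 < lookback:
            values.append(None)
            continue
        window = prices[i + 1 - lookback:i + 1]
        deviation = stdev(window)  # sample standard deviation
        if deviation == 0:
            values.append(None)  # flat window: the value is undefined
        else:
            values.append((prices[i] - mean(window)) / deviation)
    return values


prices = [10.0, 10.0, 10.0, 10.0, 14.0]
values = rolling_bollinger(prices, lookback=5)
print(values[-1])  # roughly 1.79, just under the upper band at 2
```

The first four entries are None because a 5-day window is not yet available, which mirrors the missing values at the start of the rolling data in the generator.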
At this point we need one more generator. We will call this one BollingerEventStudy. Essentially, all it will do is run through the Bollinger data and check whether our conditions for issuing a buy order are met.
class BollingerEventStudy(DataGenerator):
    name = "events"

    def run(self, data, symbols, start, end, lookback, **kwargs):
        bollinger_data = data['bollinger']

        # Include one extra timestamp before start so that the
        # prior day's data is available for the first day
        start_index = bollinger_data.index.get_loc(start) - 1
        timestamps = bollinger_data.index[start_index:]

        # Find events where a symbol falls to or below the lower band
        # (-2.0) while the market (SPX) is at or above 1.2
        bollinger_spy = bollinger_data['SPX'] >= 1.2  # Series
        bollinger_today = bollinger_data.loc[timestamps[1:]] <= -2.0
        bollinger_yesterday = bollinger_data.loc[timestamps[:-1]] >= -2.0

        # When we look up a date in bollinger_yesterday,
        # we want the data from the day before our input
        bollinger_yesterday.index = bollinger_today.index
        events = (bollinger_today & bollinger_yesterday).mul(
            bollinger_spy, axis=0)

        return events.fillna(0)
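The event condition can be easier to follow on a single symbol. The sketch below restates it with an explicit loop, purely for readability; find_events and the sample values are hypothetical, and the thresholds are the ones used in the generator above. Unlike the generator, it has no day before the first one to look back on, so it simply marks the first day as a non-event.

```python
def find_events(symbol_values, market_values, lower=-2.0, market_min=1.2):
    """Flag days where the symbol crosses down to the lower band
    while the market's Bollinger value is at least `market_min`."""
    events = [False]  # no prior day to compare against
    for i in range(1, len(symbol_values)):
        crossed = (symbol_values[i - 1] >= lower
                   and symbol_values[i] <= lower)
        events.append(crossed and market_values[i] >= market_min)
    return events


stock = [-1.0, -1.5, -2.3, -2.5, -0.5, -2.1]
market = [1.0, 1.3, 1.4, 1.5, 0.9, 1.1]
events = find_events(stock, market)
print(events)  # [False, False, True, False, False, False]
```

Only the third day qualifies: the fourth day is already below the band the day before, and the last day crosses the band while the market value is below 1.2.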
Note
Notice how all the data generators use the pandas library as much as possible instead of Python for loops. This is key to keeping your simulations fast. In general, try to keep as much code as possible running in C by using libraries like NumPy and pandas.
Now we need to create an order generator. One thing we need to do is keep track of sell orders, which we want to execute 5 days after the “buy” order. To do that, the first call to run() invokes the setup() method, which creates a DataFrame for recording pending sell orders.
class OrderGenerator(object):

    def setup(self, events):
        sell_orders = pd.DataFrame(index=events.index, columns=events.columns)
        sell_orders = sell_orders.fillna(0)
        self.sell_orders = sell_orders

    def run(self, prices, timestamp, cash, data, **kwargs):
        """ Takes bollinger event data and generates orders """
        events = data['events']
        if not hasattr(self, 'sell_orders'):
            self.setup(events)
Note
The order generator API may change slightly in a future version to allow for less hacky setup functions.
The rest of the run() function will find all buy signals from the event study, find all sell orders from the sell_orders DataFrame, and create orders from both sources. When creating a buy order, it will also add a sell order to the sell_orders DataFrame.
        # def run(...):
        #     ...

        orders = Orders()

        # Find buy events for this timestamp
        timestamps = prices.index
        daily_data = events.loc[timestamp]
        order_series = daily_data[daily_data > 0]

        # Sell 5 market days after bought
        index = timestamps.get_loc(timestamp)
        if index + 5 >= len(timestamps):
            sell_datetime = timestamps[-1]
        else:
            sell_datetime = timestamps[index + 5]

        symbols = order_series.index
        self.sell_orders.loc[sell_datetime, symbols] -= 100
        daily_sell_data = self.sell_orders.loc[timestamp]
        daily_sell_orders = daily_sell_data[daily_sell_data != 0]

        # Buy and sell in increments of 100
        for symbol in daily_sell_orders.index:
            orders.add_order(symbol, -100)

        daily_event_data = events.loc[timestamp]
        daily_buy_orders = daily_event_data[daily_event_data != 0]

        # Buy and sell in increments of 100
        for symbol in daily_buy_orders.index:
            orders.add_order(symbol, 100)

        return orders
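The sell-date rule in the code above can be isolated into a small helper. The sketch below is only an illustration; sell_index is a hypothetical name and it works on integer positions rather than timestamps. It shows that a position bought within the last 5 market days of the backtest is scheduled for sale on the final day, so that nothing is left open at the end.

```python
def sell_index(buy_index, n_timestamps, holding_days=5):
    """Position of the sell date: `holding_days` market days after the
    buy, clamped to the last available timestamp."""
    return min(buy_index + holding_days, n_timestamps - 1)


# With 10 market days (positions 0..9):
print(sell_index(0, 10))  # 5
print(sell_index(4, 10))  # 9
print(sell_index(7, 10))  # 9, clamped to the last day
```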
Now we register the order generator and execute the backtest.
prophet.set_order_generator(OrderGenerator())
backtest = prophet.run_backtest(start=dt.datetime(2008, 1, 1),
                                end=dt.datetime(2009, 12, 31), lookback=20)
The last step is to analyze the portfolio:
prophet.register_portfolio_analyzers(default_analyzers)
analysis = prophet.analyze_backtest(backtest)
print(analysis)
default_analyzers is a list of the four types of analysis we want. Much like the BollingerData generator, the Sharpe ratio analyzer uses the data returned by the volatility and average return analyzers to generate a Sharpe ratio.
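To illustrate how a Sharpe ratio can be derived from an average return and a volatility, here is a minimal sketch using only the standard library. It is not Prophet's analyzer code. It assumes daily returns, 252 trading days per year, a risk-free rate of zero and the population standard deviation, and the function names are hypothetical; the library's analyzers may make different choices.

```python
from math import sqrt
from statistics import mean, pstdev


def daily_returns(portfolio_values):
    """Fractional change between consecutive portfolio values."""
    return [
        portfolio_values[i] / portfolio_values[i - 1] - 1.0
        for i in range(1, len(portfolio_values))
    ]


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized average return divided by volatility
    (risk-free rate assumed to be zero)."""
    average_return = mean(returns)
    volatility = pstdev(returns)  # population standard deviation
    return sqrt(periods_per_year) * average_return / volatility


returns = [0.01, 0.03]  # average 0.02, volatility 0.01
ratio = sharpe_ratio(returns)
print(round(ratio, 2))  # 31.75
```

The point of the sketch is the dependency: the ratio is computed from two numbers that are useful metrics in their own right, which is why it makes sense for one analyzer to reuse the results of the others.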