Prototyping Trading Strategies with Python¶

Getting Started with Python for Algorithmic Trading¶

February, 2017

About me¶

History¶

Programming since forever
Operating in Online / AdTech space since 1997
Trading since 2013
Switched to automated trading in 2014

Co-Founder of Native Alpha¶

Trading mostly Futures and Options using automated and semi-automated strategies, with a 100% technical / quantitative approach

Developer of ezIBpy¶

A Pythonic wrapper for the IbPy library that eases communication with Interactive Brokers for market data and order execution.

Available via github.com/ranaroussi/ezibpy

Developer of QTPyLib¶

A simple, event-driven algorithmic trading system written in Python that supports backtesting and live trading, using Interactive Brokers for market data and order execution (QTPyLib stands for Quantitative Trading Python Library).

Available via github.com/ranaroussi/qtpylib

Agenda¶

Why Python for Algorithmic Trading?
The Python Toolbox: Platforms and Libraries for Trading
Data Wrangling and Working with Market Data
Using Technical Indicators with Python
Coming Up with Trading Ideas
Writing and Testing your first Trading Strategy
Strategy Development Workflow

* This is the first webinar in a multi-webinar series

Why Python?¶

Python's Benefits¶

A very high-level language with clean and readable syntax
Fast learning curve
Object-oriented, multi-paradigm
A rich standard library of modules
One of the most used languages in IoT, data science, finance, and web development
A strong open-source community

"Hello World" in Python¶

In [ ]:

print("hello world")

Compared to C#...¶

In [ ]:

public class Hello
{
   public static void Main()
   {
      System.Console.WriteLine("Hello, World!");
   }
}

Why Python for Trading?¶

Python for Algorithmic Trading¶

Python's syntax is attractive for science, and particularly finance
Fast learning curve
Solid Machine Learning libraries
Quick prototyping, fast enough execution
Many Algorithmic Trading Libraries and Tools (Zipline, PyAlgoTrade, PyBacktest, QTPyLib)
Pandas!

The Python Toolbox¶

Essential Libraries¶

Pandas - time series and tabular data wrangler
NumPy - fast (vectorized) array operations
Matplotlib - plotting library
Jupyter Notebook - research platform

Basically - just install Anaconda :-)

Recommended Libraries¶

TA-Lib - technical indicators
scikit-learn - machine learning algorithms
TensorFlow - deep learning / neural networks
SciPy - scientific functions
statsmodels - statistical functions
SQLAlchemy - relational database abstraction library
TsTables - tick data storage and retrieval
ezIBpy - Interactive Brokers bridge
OandaPy - Oanda bridge

Python 2 or Python 3?¶

Python 3!¶

Data Wrangling and Working with Market Data¶

Pandas - the time series and tabular data wrangler¶

Pandas is one of the best libraries (in any language, IMO) for data exploration and manipulation
Main objects are DataFrame and Series
Has many available built-in operations

In [3]:

import pandas as pd
import numpy as np
from pandas_datareader import data

spy = data.get_data_yahoo("SPY", start="2000-01-01")
spy.head()

Out[3]:

	Open	High	Low	Close	Volume	Adj Close
Date
2000-01-03	148.250000	148.250000	143.875000	145.4375	8164300	105.825332
2000-01-04	143.531204	144.062500	139.640594	139.7500	8089800	101.686912
2000-01-05	139.937500	141.531204	137.250000	140.0000	12177900	101.868820
2000-01-06	139.625000	141.500000	137.750000	137.7500	6227200	100.231643
2000-01-07	140.312500	145.750000	140.062500	145.7500	8066500	106.052718

In [4]:

ratio = spy["Close"] / spy["Adj Close"]

spy["close"]  = spy["Adj Close"]
spy["open"]   = spy["Open"] / ratio
spy["high"]   = spy["High"] / ratio
spy["low"]    = spy["Low"] / ratio
spy["volume"] = spy["Volume"]

spy = spy[['open','high','low','close','volume']]
spy.head()

Out[4]:

	open	high	low	close	volume
Date
2000-01-03	107.871804	107.871804	104.688403	105.825332	8164300
2000-01-04	104.438246	104.824835	101.607304	101.686912	8089800
2000-01-05	101.823343	102.982977	99.867825	101.868820	12177900
2000-01-06	101.595958	102.960272	100.231643	100.231643	6227200
2000-01-07	102.096206	106.052718	101.914297	106.052718	8066500
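
The adjustment in In [4] rescales every price by the same Close / Adj Close ratio, so the adjusted close becomes the close and the other columns stay consistent with it. A minimal sketch on made-up numbers (the two rows below are hypothetical, not Yahoo data):

```python
import pandas as pd

# Hypothetical raw bars: the first day carries a 2:1 adjustment, the second none
raw = pd.DataFrame(
    {
        "Open": [100.0, 51.0],
        "High": [104.0, 53.0],
        "Low": [98.0, 50.0],
        "Close": [102.0, 52.0],
        "Adj Close": [51.0, 52.0],
    },
    index=pd.to_datetime(["2020-01-02", "2020-01-03"]),
)

# Ratio of unadjusted to adjusted close; 1.0 means no adjustment on that day
ratio = raw["Close"] / raw["Adj Close"]

adjusted = pd.DataFrame(
    {
        "open": raw["Open"] / ratio,
        "high": raw["High"] / ratio,
        "low": raw["Low"] / ratio,
        "close": raw["Adj Close"],
    }
)
print(adjusted)
```

Dividing by the ratio keeps each bar's internal proportions (high vs. low, open vs. close) unchanged while putting all days on the same price scale.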

In [5]:

spy.to_csv('~/Desktop/sp500_ohlc.csv')

In [6]:

spy.shape

Out[6]:

(4312, 5)

In [7]:

# calculate returns
spy['return'] = spy['close'].pct_change()
spy['return'].describe()

Out[7]:

count    4311.000000
mean        0.000264
std         0.012463
min        -0.098448
25%        -0.005092
50%         0.000652
75%         0.005935
max         0.145198
Name: return, dtype: float64

In [8]:

# slicing
spy[5:10][["close", "return"]]

Out[8]:

	close	return
Date
2000-01-10	106.416535	0.003431
2000-01-11	105.143175	-0.011966
2000-01-12	104.097201	-0.009948
2000-01-13	105.506992	0.013543
2000-01-14	106.939489	0.013577

In [9]:

spy[ spy['return'] > 0.005 ]['return'].describe()

Out[9]:

count    1244.000000
mean        0.012878
std         0.009962
min         0.005002
25%         0.006912
50%         0.010097
75%         0.014883
max         0.145198
Name: return, dtype: float64

In [10]:

spy['close'].plot()

Out[10]:

<matplotlib.axes._subplots.AxesSubplot at 0x110e1d9b0>

In [11]:

spy['return'].plot.hist(bins=100, edgecolor='white')

Out[11]:

<matplotlib.axes._subplots.AxesSubplot at 0x112b8a780>

In [12]:

# resampling
spy['return'].resample("1A").sum() * 100

Out[12]:

Date
2000-12-31    -6.429918
2001-12-31   -10.111384
2002-12-31   -20.833354
2003-12-31    26.198963
2004-12-31    10.784570
2005-12-31     5.246394
2006-12-31    15.209572
2007-12-31     6.277696
2008-12-31   -37.358504
2009-12-31    26.929435
2010-12-31    15.631094
2011-12-31     4.527766
2012-12-31    15.638902
2013-12-31    28.622693
2014-12-31    13.265248
2015-12-31     2.414655
2016-12-31    12.184697
2017-12-31     5.575270
Freq: A-DEC, Name: return, dtype: float64
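
Summing daily simple returns, as In [12] does, is a quick approximation that ignores compounding. A small sketch of the difference on hypothetical returns; it groups by the index year instead of using a resample alias, purely to keep the example independent of the pandas version:

```python
import pandas as pd

# Hypothetical daily simple returns spanning two calendar years
returns = pd.Series(
    [0.10, -0.10, 0.05, 0.02, -0.01],
    index=pd.to_datetime(
        ["2020-12-29", "2020-12-30", "2020-12-31", "2021-01-04", "2021-01-05"]
    ),
)

by_year = returns.groupby(returns.index.year)

# Arithmetic sum: the shortcut used for the yearly table
summed = by_year.sum()

# Compounded return: product of (1 + r) over the year, minus 1
compounded = by_year.apply(lambda r: (1 + r).prod() - 1)

print(pd.DataFrame({"summed": summed, "compounded": compounded}))
```

The gap between the two columns grows with volatility, so the summed figures are best read as a rough guide rather than exact yearly performance.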

Using Technical Indicators with Python¶

Technical Indicators with Python¶

For basic indicators use Pandas methods
Write your own indicators in Python
Use TA-Lib
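
As an illustration of the second option, here is a sketch of an RSI written with pandas only. It assumes Wilder-style exponential smoothing (alpha = 1 / period); the function name rsi_wilder is hypothetical, and early values may differ from TA-Lib's output because the smoothing is seeded differently.

```python
import pandas as pd


def rsi_wilder(close: pd.Series, period: int = 14) -> pd.Series:
    """RSI sketch using exponentially smoothed average gains and losses."""
    delta = close.diff().dropna()
    gains = delta.clip(lower=0)        # positive moves, zero otherwise
    losses = (-delta).clip(lower=0)    # size of negative moves, zero otherwise
    avg_gain = gains.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    avg_loss = losses.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    rs = avg_gain / avg_loss           # inf when there are no losses at all
    return 100 - 100 / (1 + rs)


# Two toy series: one that only rises, one that only falls
rising = pd.Series([float(x) for x in range(1, 21)])
falling = rising[::-1].reset_index(drop=True)

print(rsi_wilder(rising, period=2).tail(3))
print(rsi_wilder(falling, period=2).tail(3))
```

The two toy series pin the indicator to its extremes, which is a cheap way to check a hand-written indicator before trusting it on real prices.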

In [35]:

# pandas' rolling mean
spy['ma1'] = spy['close'].rolling(window=50).mean()
spy['ma2'] = spy['close'].rolling(window=200).mean()

In [13]:

spy[['ma1', 'ma2', 'close']].plot(linewidth=1)

Out[13]:

<matplotlib.axes._subplots.AxesSubplot at 0x1192ebfd0>

In [14]:

# rolling standard deviation
spy['return'].rolling(window=20).std().plot()

Out[14]:

<matplotlib.axes._subplots.AxesSubplot at 0x119a234a8>

In [15]:

# Calculating log returns and volatility
spy['logret'] = np.log(spy['close'] / spy['close'].shift(1))
spy['volatility'] = spy['logret'].rolling(window=252).std() * np.sqrt(252)

spy[['close', 'volatility']].plot(subplots=True)

Out[15]:

array([<matplotlib.axes._subplots.AxesSubplot object at 0x11a0c9518>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x11a0ff198>], dtype=object)

In [17]:

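# pure python bollinger bands
import matplotlib.pyplot as plt

spy['sma'] = spy['close'].rolling(window=20).mean()
spy['std'] = spy['close'].rolling(window=20).std()

# bands sit two rolling standard deviations above and below the 20-day SMA
spy['upperbb'] = spy['sma'] + (spy['std'] * 2)
spy['lowerbb'] = spy['sma'] - (spy['std'] * 2)

# plot_candlestick() is a custom helper whose definition is not shown here
plot_candlestick(spy[-100:])
plt.plot(spy[-100:][['upperbb', 'sma', 'lowerbb']], linewidth=1)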

Out[17]:

[<matplotlib.lines.Line2D at 0x11a0b7048>,
 <matplotlib.lines.Line2D at 0x11b012898>,
 <matplotlib.lines.Line2D at 0x11b012ba8>]

In [36]:

# using TA-Lib..
import talib as ta

spy['rsi'] = ta.RSI(spy['close'].values, timeperiod=2)
spy[-100:][['close', 'rsi']].plot(subplots=True)

Out[36]:

array([<matplotlib.axes._subplots.AxesSubplot object at 0x11d499d68>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x11d46c320>], dtype=object)

In [19]:

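# having some fun :)
import matplotlib.pyplot as plt
from matplotlib import gridspec

# named `recent` so the pandas_datareader `data` module imported earlier is not shadowed
recent = spy[-100:]

fig = plt.figure()
gs = gridspec.GridSpec(2, 1, height_ratios=[4,1])

# top panel: closing price, shaded down to the period low
ax0 = plt.subplot(gs[0])
ax0.plot(recent['close'])
ax0.set_ylabel('close')
ax0.fill_between(recent.index, recent['close'].min(), recent['close'], alpha=.25)

# bottom panel: 2-period RSI with overbought (90) and oversold (10) lines
ax1 = plt.subplot(gs[1], sharex=ax0)
ax1.set_ylabel('RSI 2')
ax1.plot(recent['rsi'], color="navy", linewidth=1)
ax1.axhline(90, color='r', linewidth=1.5)
ax1.axhline(10, color='g', linewidth=1.5)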

Out[19]:

<matplotlib.lines.Line2D at 0x11c0903c8>

Coming Up with Trading Ideas¶

Trading Ideas Sources¶

Books
Podcasts
Forums
Workshops and Meetups
SSRN (Social Science Research Network)
Observations
Bugs and miscalculations 😁

Strategy Development Workflow¶

My Workflow¶

Quick proof of concept of a trading idea
Use a vectorized backtest first for fast testing and dumping ideas
Move to an event-based backtest for trading ideas that show potential
Scrutinize and Optimize
Forward test
Live Trading

Vectorized vs. Event Based Backtesting¶

Vectorized Backtesting¶

Pros:¶

Very fast compared to alternative
Great for quickly testing ideas with minimal coding

Cons:¶

Danger of look-ahead bias
Works off pre-existing, pre-loaded data
Code usually cannot be used "as-is" for live trading
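
Look-ahead bias usually creeps in when a signal and the return it trades on are taken from the same row. A deliberately exaggerated sketch on random, hypothetical returns: the rule "be long on up days" looks spectacular when unshifted, and only the shifted version could actually be traded.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Hypothetical daily returns with no predictable structure
returns = pd.Series(rng.normal(0.0, 0.01, 1000))

# Signal computed from the same day's return (only known after the close)
signal = (returns > 0).astype(int)

# Biased: today's signal applied to today's return
biased = signal * returns

# Tradable: yesterday's signal applied to today's return
honest = signal.shift(1) * returns

print("biased total:", round(biased.sum(), 4))
print("honest total:", round(honest.sum(), 4))
```

A single missing shift(1) is enough to turn noise into an apparently perfect strategy, which is why results that look too good deserve a check of how signals and returns are aligned.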

Event-Based Backtesting¶

Pros:¶

More reliable backtesting (no look-ahead bias)
Can handle live data
Code can be used "as-is" for live trading

Cons:¶

Takes longer to write and backtest
Not the best choice for prototyping IMO
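
To make the contrast concrete, here is a minimal event-style loop for a "buy on close after a drop of 0.5% or more, sell on the next close" rule, checked against its vectorized equivalent. It is only a sketch on hypothetical random returns and uses none of the QTPyLib or Zipline APIs; the on_bar function and its state dictionary are illustrative names.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
returns = pd.Series(rng.normal(0.0, 0.01, 500))  # hypothetical daily returns

THRESHOLD = -0.005  # a drop of 0.5% or more


def on_bar(state, bar_return):
    """Handle one bar: book P&L for an open position, then decide the next one."""
    pnl = bar_return if state["in_position"] else 0.0
    # The decision uses only information available at this bar's close
    state["in_position"] = bool(bar_return <= THRESHOLD)
    return pnl


# Event-style: bars arrive one at a time and state is carried between calls
state = {"in_position": False}
event_pnl = pd.Series([on_bar(state, r) for r in returns], index=returns.index)

# Vectorized: keep the returns of days that follow a qualifying drop
vector_pnl = returns.where(returns.shift(1) <= THRESHOLD, 0.0)

print("event total: ", round(event_pnl.sum(), 6))
print("vector total:", round(vector_pnl.sum(), 6))
```

Because on_bar only ever sees the current bar and its own state, the same function could in principle be fed live bars, which is the main appeal of the event-based style.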

Basic Strategy Example¶

BUY ON CLOSE when SPY drops 0.5% (or more)
SELL ON CLOSE of next day

Vectorized Prototyping¶

In [20]:

portfolio = pd.DataFrame(data={ 'spy': spy['return'] })
portfolio['strategy'] = portfolio[portfolio['spy'].shift(1) <= -0.005]['spy']

In [21]:

portfolio.fillna(0).cumsum().plot()

Out[21]:

<matplotlib.axes._subplots.AxesSubplot at 0x119ad32b0>

Measure Performance with Sharpe Ratio¶

Sharpe Ratio¶

Measures the risk-adjusted return of an investment (>1 is good, >2 is great)

Formula:¶


Sharpe(X) = (rX - Rf) / StdDev(X)

Where:¶

X is the investment
rX is the average rate of return of X
Rf is the rate of return of the best available risk-free security (e.g. T-bills)
StdDev(X) is the standard deviation of the returns of X

Annualized Sharpe Ratio¶

In [22]:

def sharpe(returns, periods=252, riskfree=0):
    returns = returns.dropna()
    return np.sqrt(periods) * (np.mean(returns-riskfree)) / np.std(returns)
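
A quick sanity check of the annualization, on a hypothetical four-day return series. The helper below restates the function above with the population standard deviation spelled out (ddof=0, which is also NumPy's default for np.std); the name sharpe_ratio is illustrative.

```python
import numpy as np
import pandas as pd


def sharpe_ratio(returns, periods=252, riskfree=0.0):
    """Annualized Sharpe ratio of per-period returns (population std)."""
    returns = returns.dropna()
    excess = returns - riskfree
    return np.sqrt(periods) * excess.mean() / returns.std(ddof=0)


# Hypothetical daily returns; the NaN mimics the first row after pct_change()
daily = pd.Series([np.nan, 0.01, -0.01, 0.02, 0.00])

per_period = sharpe_ratio(daily, periods=1)    # no annualization
annualized = sharpe_ratio(daily, periods=252)  # scaled by sqrt(252)

print(round(per_period, 4), round(annualized, 4))
```

The only difference between the two numbers is the sqrt(periods) factor, so the periods argument should match the bar size of the data (252 for daily bars, 12 for monthly, and so on).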

Initial (and very basic) metrics¶

In [23]:

# benchmark sharpe
sharpe(portfolio['spy'])

Out[23]:

0.33619610122573501

In [24]:

# strategy sharpe
sharpe(portfolio['strategy'])

Out[24]:

1.3283765902862263

In [25]:

# time in market
len(portfolio['strategy'].dropna()) / len(portfolio)

Out[25]:

0.25185528756957326

EOY Returns¶

In [38]:

eoy = portfolio.resample("A").sum()
eoy['diff'] = eoy['strategy']/eoy['spy']

print( np.round(eoy[['spy', 'strategy', 'diff']] * 100, 2) )

              spy  strategy    diff
Date                               
2000-12-31  -6.43      3.69  -57.44
2001-12-31 -10.11     -5.20   51.39
2002-12-31 -20.83      9.04  -43.40
2003-12-31  26.20     13.54   51.69
2004-12-31  10.78      2.37   22.00
2005-12-31   5.25      9.82  187.18
2006-12-31  15.21      7.22   47.48
2007-12-31   6.28     19.61  312.41
2008-12-31 -37.36     23.44  -62.75
2009-12-31  26.93      8.87   32.96
2010-12-31  15.63     17.44  111.55
2011-12-31   4.53     10.44  230.57
2012-12-31  15.64      5.60   35.82
2013-12-31  28.62      9.61   33.58
2014-12-31  13.27      6.53   49.20
2015-12-31   2.41     -2.61 -108.18
2016-12-31  12.18      8.18   67.12
2017-12-31   5.58     -0.01   -0.16

Another Example¶

GO LONG where 50-day SMA > 200-day SMA
GO SHORT where 50-day SMA < 200-day SMA

In [41]:

ma_portfolio = spy[['close', 'return']].copy()
ma_portfolio.rename(columns={'return':'spy'}, inplace=True)

In [42]:

ma_portfolio['ma1'] = ma_portfolio['close'].rolling(window=50).mean()
ma_portfolio['ma2'] = ma_portfolio['close'].rolling(window=200).mean()

In [43]:

ma_portfolio[['ma1', 'ma2', 'close']].plot(linewidth=1)

Out[43]:

<matplotlib.axes._subplots.AxesSubplot at 0x11da647f0>

In [28]:

ma_portfolio['position'] = np.where(
    ma_portfolio['ma1'].shift(1) > ma_portfolio['ma2'].shift(1), 1, -1)

ma_portfolio['strategy'] = ma_portfolio['position'] * ma_portfolio['spy']
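
One detail worth knowing about the np.where call above: while the 200-day average is still NaN, the comparison evaluates to False, so the position defaults to -1 (short) during the warm-up period. A sketch of one way to stay flat until both averages exist, on hypothetical random-walk prices with shorter windows (the fast, slow, and ready names are illustrative):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical price path built from random daily returns
close = pd.Series(100 * np.cumprod(1 + rng.normal(0.0, 0.01, 300)))
returns = close.pct_change()

fast = close.rolling(window=5).mean()
slow = close.rolling(window=20).mean()

# +1 / -1 from the crossover, using yesterday's averages
raw_position = np.where(fast.shift(1) > slow.shift(1), 1, -1)

# Stay flat (0) until the slow average has a value to compare against
ready = slow.shift(1).notna()
position = pd.Series(np.where(ready, raw_position, 0), index=close.index)

strategy = position * returns
print(position.value_counts())
```

Whether to stay flat or simply drop the warm-up rows is a design choice; either way, the early part of the equity curve then reflects the rule rather than a default value.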

In [29]:

ma_portfolio[['strategy', 'spy']].cumsum().plot()

Out[29]:

<matplotlib.axes._subplots.AxesSubplot at 0x11c13d2b0>

Initial metrics¶

In [30]:

# benchmark sharpe
sharpe(ma_portfolio['spy'])

Out[30]:

0.33619610122573501

In [31]:

# strategy sharpe
sharpe(ma_portfolio['strategy'])

Out[31]:

0.43439861413278519

In [32]:

# time in market
len(ma_portfolio['strategy'].dropna()) / len(ma_portfolio)

Out[32]:

0.9997680890538033

In [40]:

eoy = ma_portfolio.resample("A").sum()
eoy['diff'] = eoy['strategy']/eoy['spy']
print( np.round(eoy[['spy', 'strategy', 'diff']] * 100, 2) )

              spy  strategy    diff
Date                               
2000-12-31  -6.43     13.20 -205.34
2001-12-31 -10.11     10.11 -100.00
2002-12-31 -20.83     17.73  -85.10
2003-12-31  26.20      9.20   35.13
2004-12-31  10.78      6.56   60.84
2005-12-31   5.25      5.25  100.00
2006-12-31  15.21      9.06   59.57
2007-12-31   6.28      7.76  123.58
2008-12-31 -37.36     37.36 -100.00
2009-12-31  26.93     15.67   58.19
2010-12-31  15.63     -7.32  -46.82
2011-12-31   4.53    -10.19 -225.06
2012-12-31  15.64      6.07   38.84
2013-12-31  28.62     28.62  100.00
2014-12-31  13.27     13.27  100.00
2015-12-31   2.41     -9.04 -374.21
2016-12-31  12.18    -11.91  -97.75
2017-12-31   5.58      5.58  100.00


To Be Continued...¶

Up Next...¶

Reading list¶

Python Books¶

Trading Books¶

Q&A Time¶

Prototyping Trading Strategies with Python¶

Thank you for attending!¶