Prototyping Trading Strategies with Python


Getting Started with Python for Algorithmic Trading


© Ran Aroussi
@aroussi | aroussi.com | github.com/ranaroussi



February 2017

About me

History

  • Programming since forever
  • Operating in Online / AdTech space since 1997
  • Trading since 2013
  • Switched to automated trading in 2014

Co-Founder of Native Alpha

Trading mostly Futures and Options using automated and semi-automated strategies, with a 100% technical / quantitative approach

Developer of ezIBpy

A Pythonic wrapper for the IbPy library that eases communication with Interactive Brokers for market data and order execution.

Available via github.com/ranaroussi/ezibpy

Developer of QTPyLib

A simple, event-driven algorithmic trading system written in Python that supports backtesting and live trading, using Interactive Brokers for market data and order execution (QTPyLib stands for Quantitative Trading Python Library).

Available via github.com/ranaroussi/qtpylib

Agenda

  • Why Python for Algorithmic Trading?
  • The Python Toolbox: Platforms and Libraries for Trading
  • Data Wrangling and Working with Market Data
  • Using Technical Indicators with Python
  • Coming Up with Trading Ideas
  • Writing and Testing your first Trading Strategy
  • Strategy Development Workflow

* This is the first webinar in a multi-webinar series

Why Python?

Python's Benefits

  • A very high-level language with clean and readable syntax
  • Fast learning curve
  • Object-oriented, multi-paradigm
  • A rich standard library of modules
  • One of the most used languages in IoT, data science, finance, and web development
  • A strong open-source community

"Hello World" in Python

In [ ]:
print("hello world")

Compared to C#...

In [ ]:
public class Hello
{
   public static void Main()
   {
      System.Console.WriteLine("Hello, World!");
   }
}

Why Python for Trading?

Python for Algorithmic Trading

  • Python's syntax is attractive for science, and particularly finance
  • Fast learning curve
  • Solid Machine Learning libraries
  • Quick prototyping, fast enough execution
  • Many Algorithmic Trading Libraries and Tools (Zipline, PyAlgoTrade, PyBacktest, QTPyLib)
  • Pandas!

The Python Toolbox

Essential Libraries

  • Pandas - time series and tabular data wrangler
  • NumPy - fast (vectorized) array operations
  • Matplotlib - plotting library
  • Jupyter Notebook - research platform

Basically - just install Anaconda :-)

  • TA-Lib - technical indicators
  • scikit-learn - machine learning algorithms
  • TensorFlow - deep learning / neural networks
  • SciPy - scientific functions
  • statsmodels - statistical functions
  • SQLAlchemy - relational database abstraction library
  • TsTables - tick data storage and retrieval
  • ezIBpy - Interactive Brokers bridge
  • OandaPy - Oanda bridge

Python 2 or Python 3?

Python 3!

Data Wrangling and Working with Market Data

Pandas - the time series and tabular data wrangler

  • Pandas is one of the best libraries (in any language, IMO) for data exploration and manipulation
  • Main objects are DataFrame and Series
  • Has many available built-in operations
In [3]:
import pandas as pd
import numpy as np
from pandas_datareader import data

spy = data.get_data_yahoo("SPY", start="2000-01-01")
spy.head()
Out[3]:
Open High Low Close Volume Adj Close
Date
2000-01-03 148.250000 148.250000 143.875000 145.4375 8164300 105.825332
2000-01-04 143.531204 144.062500 139.640594 139.7500 8089800 101.686912
2000-01-05 139.937500 141.531204 137.250000 140.0000 12177900 101.868820
2000-01-06 139.625000 141.500000 137.750000 137.7500 6227200 100.231643
2000-01-07 140.312500 145.750000 140.062500 145.7500 8066500 106.052718
In [4]:
ratio = spy["Close"] / spy["Adj Close"]

spy["close"]  = spy["Adj Close"]
spy["open"]   = spy["Open"] / ratio
spy["high"]   = spy["High"] / ratio
spy["low"]    = spy["Low"] / ratio
spy["volume"] = spy["Volume"]

spy = spy[['open','high','low','close','volume']]
spy.head()
Out[4]:
open high low close volume
Date
2000-01-03 107.871804 107.871804 104.688403 105.825332 8164300
2000-01-04 104.438246 104.824835 101.607304 101.686912 8089800
2000-01-05 101.823343 102.982977 99.867825 101.868820 12177900
2000-01-06 101.595958 102.960272 100.231643 100.231643 6227200
2000-01-07 102.096206 106.052718 101.914297 106.052718 8066500
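
The adjustment in the cell above divides open, high and low by the same Close / Adj Close ratio, so that all four prices sit on the adjusted scale. As a minimal sketch, the same steps can be wrapped in a function and checked on the first two rows of the Yahoo output shown earlier. The name adjust_ohlc is a hypothetical helper introduced here for illustration; it is not part of pandas or pandas_datareader.

```python
import pandas as pd


def adjust_ohlc(df):
    """Rescale raw OHLC prices onto the adjusted-close scale (hypothetical helper)."""
    # Same ratio as in the cell above: raw close over adjusted close
    ratio = df["Close"] / df["Adj Close"]
    out = pd.DataFrame(index=df.index)
    out["open"] = df["Open"] / ratio
    out["high"] = df["High"] / ratio
    out["low"] = df["Low"] / ratio
    out["close"] = df["Adj Close"]
    out["volume"] = df["Volume"]
    return out


# First two rows of the raw SPY data shown above, typed in by hand
raw = pd.DataFrame(
    {
        "Open": [148.25, 143.531204],
        "High": [148.25, 144.0625],
        "Low": [143.875, 139.640594],
        "Close": [145.4375, 139.75],
        "Volume": [8164300, 8089800],
        "Adj Close": [105.825332, 101.686912],
    },
    index=pd.to_datetime(["2000-01-03", "2000-01-04"]),
)

adjusted = adjust_ohlc(raw)
print(adjusted)
```

Because every price in a row is divided by the same ratio, relationships within a bar (for example high relative to close) are unchanged; only the scale moves.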
In [5]:
spy.to_csv('~/Desktop/sp500_ohlc.csv')
In [6]:
spy.shape
Out[6]:
(4312, 5)
In [7]:
# calculate returns
spy['return'] = spy['close'].pct_change()
spy['return'].describe()
Out[7]:
count    4311.000000
mean        0.000264
std         0.012463
min        -0.098448
25%        -0.005092
50%         0.000652
75%         0.005935
max         0.145198
Name: return, dtype: float64
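
Note that the count above (4311) is one less than the number of rows (4312): the first row has no previous close, so its return is NaN. A small sketch using the first five adjusted closes from the table above shows this, and that pct_change() matches the manual formula close / previous close - 1.

```python
import pandas as pd

# First five adjusted closes from the output above
close = pd.Series(
    [105.825332, 101.686912, 101.868820, 100.231643, 106.052718],
    index=pd.to_datetime(
        ["2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07"]
    ),
    name="close",
)

simple = close.pct_change()           # pandas built-in
manual = close / close.shift(1) - 1   # same calculation written out

print(simple)
print("valid returns:", simple.count(), "of", len(simple), "rows")
```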
In [8]:
# slicing
spy[5:10][["close", "return"]]
Out[8]:
close return
Date
2000-01-10 106.416535 0.003431
2000-01-11 105.143175 -0.011966
2000-01-12 104.097201 -0.009948
2000-01-13 105.506992 0.013543
2000-01-14 106.939489 0.013577
In [9]:
spy[ spy['return'] > 0.005 ]['return'].describe()
Out[9]:
count    1244.000000
mean        0.012878
std         0.009962
min         0.005002
25%         0.006912
50%         0.010097
75%         0.014883
max         0.145198
Name: return, dtype: float64
In [10]:
spy['close'].plot()
Out[10]:
<matplotlib.axes._subplots.AxesSubplot at 0x110e1d9b0>
In [11]:
spy['return'].plot.hist(bins=100, edgecolor='white')
Out[11]:
<matplotlib.axes._subplots.AxesSubplot at 0x112b8a780>
In [12]:
# resampling: sum the daily returns per calendar year, in percent (2017 is a partial year)
spy['return'].resample("1A").sum() * 100
Out[12]:
Date
2000-12-31    -6.429918
2001-12-31   -10.111384
2002-12-31   -20.833354
2003-12-31    26.198963
2004-12-31    10.784570
2005-12-31     5.246394
2006-12-31    15.209572
2007-12-31     6.277696
2008-12-31   -37.358504
2009-12-31    26.929435
2010-12-31    15.631094
2011-12-31     4.527766
2012-12-31    15.638902
2013-12-31    28.622693
2014-12-31    13.265248
2015-12-31     2.414655
2016-12-31    12.184697
2017-12-31     5.575270
Freq: A-DEC, Name: return, dtype: float64
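
Summing daily simple returns, as in the cell above, is a quick approximation of the yearly return. Compounding the daily returns gives the exact figure, and the two can differ noticeably in volatile years. The sketch below illustrates the difference on four made-up daily returns (not market data); it groups by index.year instead of calling resample, purely to keep the example independent of the frequency aliases of a particular pandas version.

```python
import pandas as pd

# Made-up daily returns spread over two years (illustrative values only)
daily = pd.Series(
    [0.10, -0.10, 0.05, 0.02],
    index=pd.to_datetime(["2016-06-01", "2016-06-02", "2017-03-01", "2017-03-02"]),
)

years = daily.index.year

# Approximation: add up the daily returns of each year
summed = daily.groupby(years).sum() * 100

# Exact: compound the daily returns of each year
compounded = ((1 + daily).groupby(years).prod() - 1) * 100

print(pd.DataFrame({"summed %": summed, "compounded %": compounded}))
```

In the made-up 2016 data, +10% followed by -10% sums to zero but compounds to a loss of about 1%.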

Using Technical Indicators with Python

Technical Indicators with Python

  • For basic indicators use Pandas methods
  • Write your own indicators in Python
  • Use TA-Lib
In [35]:
# pandas' rolling mean
spy['ma1'] = spy['close'].rolling(window=50).mean()
spy['ma2'] = spy['close'].rolling(window=200).mean()
In [13]:
spy[['ma1', 'ma2', 'close']].plot(linewidth=1)
Out[13]:
<matplotlib.axes._subplots.AxesSubplot at 0x1192ebfd0>
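
One detail worth keeping in mind with rolling(window=n): the result is NaN until n observations are available, so the 200-day average above only starts on the 200th row. A tiny sketch with a window of 3 on made-up prices shows the behaviour.

```python
import pandas as pd

# Made-up closing prices (illustrative values only)
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# 3-period simple moving average; the first two rows have too little history
ma = close.rolling(window=3).mean()

print(ma)
print("rows without a value:", ma.isna().sum())
```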
In [14]:
# rolling standard deviation
spy['return'].rolling(window=20).std().plot()
Out[14]:
<matplotlib.axes._subplots.AxesSubplot at 0x119a234a8>
In [15]:
# Calculating log returns and volatility
spy['logret'] = np.log(spy['close'] / spy['close'].shift(1))
spy['volatility'] = spy['logret'].rolling(window=252).std() * np.sqrt(252)

spy[['close', 'volatility']].plot(subplots=True)
Out[15]:
array([<matplotlib.axes._subplots.AxesSubplot object at 0x11a0c9518>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x11a0ff198>], dtype=object)
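
Two properties of the calculation above can be checked on a handful of made-up prices: log returns add up across time (their sum equals the log of the total price ratio), and the annualized volatility is the rolling standard deviation scaled by sqrt(252), the usual approximation for the number of trading days in a year. The window of 3 below is only for illustration; the cell above uses 252.

```python
import numpy as np
import pandas as pd

# Made-up closing prices (illustrative values only)
close = pd.Series([100.0, 102.0, 101.0, 103.0, 104.0])

# Daily log returns, same formula as in the cell above
logret = np.log(close / close.shift(1))

# Log returns are additive: the sum equals log(last price / first price)
total_logret = logret.sum()

# Rolling volatility, annualized with sqrt(252)
window = 3  # illustrative; far too short for real use
volatility = logret.rolling(window=window).std() * np.sqrt(252)

print(total_logret, np.log(close.iloc[-1] / close.iloc[0]))
print(volatility)
```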
In [17]:
# Bollinger Bands with plain pandas (20-day SMA +/- 2 standard deviations)
import matplotlib.pyplot as plt

spy['sma'] = spy['close'].rolling(window=20).mean()
spy['std'] = spy['close'].rolling(window=20).std()

spy['upperbb'] = spy['sma'] + (spy['std'] * 2)
spy['lowerbb'] = spy['sma'] - (spy['std'] * 2)

# plot_candlestick is a custom plotting helper whose definition is not shown here
plot_candlestick(spy[-100:])
plt.plot(spy[-100:][['upperbb', 'sma', 'lowerbb']], linewidth=1)
Out[17]:
[<matplotlib.lines.Line2D at 0x11a0b7048>,
 <matplotlib.lines.Line2D at 0x11b012898>,
 <matplotlib.lines.Line2D at 0x11b012ba8>]
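
The band calculation above is a natural candidate for a reusable indicator function. The following is a minimal sketch, run on a seeded random walk rather than market data; the name bollinger_bands and its parameters are hypothetical, chosen here for illustration. Note that pandas' rolling std() defaults to the sample standard deviation (ddof=1), so other implementations may give slightly different band widths.

```python
import numpy as np
import pandas as pd


def bollinger_bands(close, window=20, num_std=2):
    """Return SMA and upper/lower bands as a DataFrame (hypothetical helper)."""
    sma = close.rolling(window=window).mean()
    std = close.rolling(window=window).std()  # sample std (ddof=1) by default
    return pd.DataFrame(
        {
            "sma": sma,
            "upperbb": sma + num_std * std,
            "lowerbb": sma - num_std * std,
        }
    )


# Seeded random walk standing in for a price series (not market data)
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 60).cumsum())

bands = bollinger_bands(close)
print(bands.tail())
```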
In [36]:
# using TA-Lib..
import talib as ta

spy['rsi'] = ta.RSI(spy['close'].values, timeperiod=2)
spy[-100:][['close', 'rsi']].plot(subplots=True)
Out[36]:
array([<matplotlib.axes._subplots.AxesSubplot object at 0x11d499d68>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x11d46c320>], dtype=object)
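
For comparison with the TA-Lib call above, here is a rough sketch of the "write your own indicator" route: an RSI built from pandas operations, using exponential smoothing with alpha = 1 / period (Wilder-style smoothing). This is an assumption-laden approximation, not a reimplementation of TA-Lib: the smoothing is seeded differently, so early values in particular may not match ta.RSI. The function name rsi and the test series are made up for illustration.

```python
import numpy as np
import pandas as pd


def rsi(close, period=14):
    """Approximate RSI using Wilder-style exponential smoothing (sketch only)."""
    delta = close.diff()
    gain = delta.clip(lower=0)       # positive moves, zero otherwise
    loss = (-delta).clip(lower=0)    # size of negative moves, zero otherwise
    avg_gain = gain.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    avg_loss = loss.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)


# Sanity checks on made-up series: steadily rising, steadily falling, and noisy
rising = pd.Series(range(1, 31), dtype=float)
falling = pd.Series(range(30, 0, -1), dtype=float)
rng = np.random.default_rng(1)
noisy = pd.Series(100 + rng.normal(0, 1, 100).cumsum())

rsi_rising = rsi(rising, period=2)
rsi_falling = rsi(falling, period=2)
rsi_noisy = rsi(noisy, period=2)

print(rsi_rising.iloc[-1], rsi_falling.iloc[-1])
print(rsi_noisy.tail())
```

A series that only rises should pin the indicator at the top of its range and one that only falls at the bottom, which makes these two cases a convenient first check before comparing against a reference implementation.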