Slide 1

Slide 1 text

Manipulating Financial Data in Python
Harry Park

Slide 2

Slide 2 text

TABLE OF CONTENTS
01 Introduction
02 Processing Stock Data
03 Using Processed Stock Data
04 Conclusions

Slide 3

Slide 3 text

Introduction 01

Slide 4

Slide 4 text

About this Presentation
- How to process historical financial data using Python
- Example uses of processed stock data

Slide 5

Slide 5 text

Why Python for Financial Data?
- Strong scientific libraries
- Well maintained
- Rich ML libraries

Slide 6

Slide 6 text

Useful Python Libraries for Finance
Pandas: “A fast, powerful, flexible and easy to use open source data analysis and manipulation tool”
- read_csv()
- DataFrame
- joining and slicing DataFrames
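A minimal sketch of building and slicing a DataFrame. The tickers are used only as column labels and the prices are made up for illustration.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only
dates = pd.date_range("2020-01-01", periods=4, freq="D")
df = pd.DataFrame(
    {"AAPL": [10.0, 11.0, 12.0, 13.0], "TSLA": [20.0, 19.0, 21.0, 22.0]},
    index=dates,
)

# Slice rows by date range (label slicing includes both endpoints)
first_three_days = df.loc["2020-01-01":"2020-01-03"]

# Slice columns by name
aapl_only = df[["AAPL"]]

# Slice rows and columns at the same time
subset = df.loc["2020-01-02":"2020-01-03", ["TSLA"]]
```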

Slide 7

Slide 7 text

Useful Python Libraries for Finance
- NumPy: “Powerful N-Dimensional Arrays”, “Numerical Computing Tools”
- yfinance: “Offers a reliable, threaded, and Pythonic way to download historical market data from Yahoo! finance”
- matplotlib / seaborn: Python visualization libraries

Slide 8

Slide 8 text

Processing Stock Data 02

Slide 9

Slide 9 text

Stock Data in the Real World
.csv: Comma-Separated Values

Slide 10

Slide 10 text

Stock Data in the Real World

Slide 11

Slide 11 text

Stock Data in the Real World
pandas.read_csv()
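A sketch of loading one stock's history with pandas.read_csv(). The column layout (Date, Open, High, Low, Close, Adj Close, Volume) is an assumption modeled on Yahoo! Finance-style exports, and the numbers are invented. An in-memory string stands in for a real file path.

```python
import io
import pandas as pd

# Hypothetical CSV contents; pass a file path instead of StringIO for a real file
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,10.0,11.0,9.5,10.5,10.4,1000
2020-01-03,10.5,11.5,10.0,11.0,10.9,1200
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    index_col="Date",                # use the date column as the index
    parse_dates=True,                # parse the index into datetimes
    usecols=["Date", "Adj Close"],   # keep only the columns of interest
)
```

Indexing by date at load time is what makes the later join and fill steps line up by trading day.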

Slide 12

Slide 12 text

Combining Multiple Stock Data
- Performing Statistical Analysis: Find Important Statistical Values
- Plotting Multiple Stocks: Compare Stocks in an Intuitive Way
- Portfolio Optimization: Find the Best Proportions of Stocks to Own

Slide 13

Slide 13 text

Combining Multiple Stock Data
df.join()
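A small sketch of how df.join() aligns two stocks on their date index. The prices are made up, and the two frames deliberately cover different dates.

```python
import pandas as pd

# Hypothetical prices for two stocks with different date coverage
aapl = pd.DataFrame(
    {"AAPL": [10.0, 11.0, 12.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)
tsla = pd.DataFrame(
    {"TSLA": [20.0, 21.0]},
    index=pd.to_datetime(["2020-01-03", "2020-01-06"]),
)

# Default is a left join: keep every date from aapl, NaN where tsla has no row
combined = aapl.join(tsla)

# how="inner" keeps only the dates present in both frames
common = aapl.join(tsla, how="inner")
```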

Slide 14

Slide 14 text

Problem #1. Trading Days
df.dropna()

Slide 15

Slide 15 text

Problem #1. Trading Days
Step 1: Create an empty DataFrame with the date range of interest
    dates = pd.date_range(start_date, end_date)
    df = pd.DataFrame(index=dates)
Step 2: Join it with the DataFrame of a stock
    # assumes apple.csv has a "Date" column to use as the index
    dfAAPL = pd.read_csv("apple.csv", index_col="Date", parse_dates=True)
    df = df.join(dfAAPL)
Step 3: Drop no-trade dates
    df = df.dropna(subset=[column_name])
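The three steps can be sketched end to end as follows. The CSV contents are hypothetical (an in-memory string replaces apple.csv), and the "Date" and "Adj Close" column names are assumptions about the file layout.

```python
import io
import pandas as pd

# Hypothetical stand-in for apple.csv; 2020-01-01 and the weekend have no rows
csv_text = """Date,Adj Close
2020-01-02,10.0
2020-01-03,11.0
2020-01-06,12.0
"""

# Step 1: empty DataFrame indexed by every calendar day in the range
dates = pd.date_range("2020-01-01", "2020-01-07")
df = pd.DataFrame(index=dates)

# Step 2: join the stock's prices onto that index
df_aapl = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)
df_aapl = df_aapl.rename(columns={"Adj Close": "AAPL"})
df = df.join(df_aapl)

# Step 3: drop the dates on which the stock did not trade
df = df.dropna(subset=["AAPL"])
```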

Slide 16

Slide 16 text

Problem #2. Incomplete Data
- What if a stock is no longer traded? (e.g., JAVA)
- What if a stock began trading only recently? (e.g., TSLA)

Slide 17

Slide 17 text

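Problem #2. Incomplete Data
Forward fill: df.fillna(method='ffill'), or df.ffill() in recent pandas versions, where the method argument is deprecated
[Figure: Price vs. Time]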

Slide 18

Slide 18 text

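Problem #2. Incomplete Data
Backward fill: df.fillna(method='bfill'), or df.bfill() in recent pandas versions, where the method argument is deprecated
[Figure: Price vs. Time]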

Slide 19

Slide 19 text

Problem #2. Incomplete Data
Forward fill before backward fill, to minimize peeking into the future.
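A minimal sketch of the fill order. The column names "OLD" (a stock that stops trading) and "NEW" (a stock that starts late) are hypothetical, as are the prices. DataFrame.ffill() and DataFrame.bfill() are used as the method forms of the fillna calls.

```python
import numpy as np
import pandas as pd

prices = pd.DataFrame(
    {
        "OLD": [5.0, 6.0, np.nan, np.nan],    # stops trading after day 2
        "NEW": [np.nan, np.nan, 30.0, 31.0],  # starts trading on day 3
    },
    index=pd.date_range("2020-01-01", periods=4),
)

# Forward fill first: carry the last known price forward (uses only past data).
# Backward fill second: only the leading gaps remain, which have no past value.
filled = prices.ffill().bfill()
```

Backward filling necessarily borrows a future price, so running it last limits that to the gaps forward filling cannot cover.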

Slide 20

Slide 20 text

Combining Multiple Stocks: Putting It All Together
Step 1: Create an empty DataFrame with the date range of your interest
Step 2: Join it with the DataFrame of a stock
Step 3: Drop the no-trade dates
Step 4: Repeat Step 2 ~ Step 3 with remaining stocks
Step 5: Fill forward missing values
Step 6: Fill backward missing values
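One possible reading of the six steps as a single function. The helper name combine_stocks, the "Date" and "Adj Close" column names, and the idea of dropping no-trade dates based on one reference stock are all assumptions made for this sketch, not something the slides specify.

```python
import io
import pandas as pd

def combine_stocks(sources, start_date, end_date, reference):
    """Combine price CSVs into one DataFrame (hypothetical helper).

    sources maps a symbol to a CSV path or file-like object.
    reference is the symbol whose trading days define the no-trade dates.
    """
    # Step 1: empty frame over the full calendar range
    df = pd.DataFrame(index=pd.date_range(start_date, end_date))
    # Join the reference stock first so its gaps mark the no-trade dates
    symbols = [reference] + [s for s in sources if s != reference]
    for symbol in symbols:
        # Steps 2 and 4: join each stock's price column
        stock = pd.read_csv(sources[symbol], index_col="Date",
                            parse_dates=True, usecols=["Date", "Adj Close"])
        df = df.join(stock.rename(columns={"Adj Close": symbol}))
        if symbol == reference:
            # Step 3: drop dates on which the reference did not trade
            df = df.dropna(subset=[reference])
    # Steps 5 and 6: forward fill, then backward fill
    return df.ffill().bfill()

# Usage with made-up data: "NEW" only starts trading on the last day
ref_csv = "Date,Adj Close\n2020-01-02,10.0\n2020-01-03,11.0\n2020-01-06,12.0\n"
new_csv = "Date,Adj Close\n2020-01-06,30.0\n"
combined = combine_stocks(
    {"REF": io.StringIO(ref_csv), "NEW": io.StringIO(new_csv)},
    "2020-01-01", "2020-01-07", reference="REF",
)
```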

Slide 21

Slide 21 text

Using Processed Stock Data 03

Slide 22

Slide 22 text

Assumption: Combined DataFrame
Date        GOOGL    AAPL     TSLA
2000.xx.xx  price 1  price 5  price 9
2000.xx.xx  price 2  price 6  price 10
2000.xx.xx  price 3  price 7  price 11
2000.xx.xx  price 4  price 8  price 12

Slide 23

Slide 23 text

Plotting Multiple Stocks in a Graph
Normalize by the first row so that every stock starts at 1.0: df / df.iloc[0]
[Figures: Price vs. Time]
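A sketch of the normalization with made-up prices. Note that `df[0]` would look up a column labeled 0, while `df.iloc[0]` selects the first row, which is what the division needs.

```python
import pandas as pd

# Hypothetical combined DataFrame: stocks at very different price levels
df = pd.DataFrame(
    {"GOOGL": [100.0, 110.0, 121.0], "AAPL": [50.0, 45.0, 55.0]},
    index=pd.date_range("2020-01-01", periods=3),
)

# Divide every row by the first row, column by column
normalized = df / df.iloc[0]

# normalized.plot() would then draw all stocks on a common scale
# (requires matplotlib to be installed)
```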

Slide 24

Slide 24 text

Computing Financial Statistics
Cumulative Return = (Current Price - Original Price) / Original Price
ex) 0.03, -0.05
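The formula applied to a combined DataFrame, as a sketch. The column names and prices are invented so that the results come out near the slide's example values.

```python
import pandas as pd

# Hypothetical prices: one stock rises from 100 to 103, one falls from 200 to 190
df = pd.DataFrame(
    {"UP": [100.0, 101.0, 103.0], "DOWN": [200.0, 195.0, 190.0]},
    index=pd.date_range("2020-01-01", periods=3),
)

# (current - original) / original, written as current / original - 1
cumulative_return = df.iloc[-1] / df.iloc[0] - 1
# roughly 0.03 for "UP" and -0.05 for "DOWN"
```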

Slide 25

Slide 25 text

Computing Financial Statistics
Sharpe Ratio = (return of portfolio - risk-free rate) / standard deviation of portfolio returns
ex) 1.43, 0.74
Simply put, a metric for risk-adjusted return
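A sketch of the Sharpe ratio computed from daily prices. The function name, the use of daily returns, the default risk-free rate of 0, and the sqrt(252) annualization factor are assumptions (common conventions, not taken from the slides).

```python
import numpy as np
import pandas as pd

def sharpe_ratio(prices, daily_risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a price Series (hypothetical helper)."""
    # Day-over-day percentage change; the first row has no previous day
    daily_returns = prices.pct_change().dropna()
    excess = daily_returns - daily_risk_free_rate
    # Mean excess return per unit of volatility, scaled to a yearly figure
    return np.sqrt(periods_per_year) * excess.mean() / excess.std()

# Usage with made-up portfolio values
portfolio_value = pd.Series([100.0, 101.0, 103.0, 102.0, 105.0])
ratio = sharpe_ratio(portfolio_value)
```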

Slide 26

Slide 26 text

Portfolio Optimization
What is a portfolio?

Slide 27

Slide 27 text

Portfolio Optimization
“The process of selecting the best portfolio (asset allocation), out of the set of all portfolios being considered, according to some objective.”

Slide 28

Slide 28 text

Portfolio Optimization
What should be the objective function to minimize?
- -1 * Cumulative Return
- -1 * Sharpe Ratio (risk-adjusted return)

Slide 29

Slide 29 text

Portfolio Optimization
Find the 'x' that minimizes f(x) = -1 * Sharpe Ratio
scipy.optimize.minimize()
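A sketch of the optimization, where 'x' is taken to be the vector of portfolio weights. The helper names, the long-only bounds (each weight between 0 and 1), the constraint that weights sum to 1, the SLSQP method, and the synthetic price data are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from scipy.optimize import minimize

def negative_sharpe(weights, prices):
    """f(x) = -1 * Sharpe ratio of the weighted portfolio (risk-free rate 0)."""
    normalized = prices / prices.iloc[0]
    portfolio_value = (normalized * weights).sum(axis=1)
    daily_returns = portfolio_value.pct_change().dropna()
    return -daily_returns.mean() / daily_returns.std()

def optimize_weights(prices):
    n = prices.shape[1]
    initial = np.full(n, 1.0 / n)   # start from equal weights
    bounds = [(0.0, 1.0)] * n       # long-only (assumption)
    constraints = {"type": "eq", "fun": lambda w: np.sum(w) - 1.0}
    result = minimize(negative_sharpe, initial, args=(prices,),
                      method="SLSQP", bounds=bounds, constraints=constraints)
    return result.x

# Synthetic prices for three hypothetical stocks (random walk, fixed seed)
rng = np.random.default_rng(0)
returns = rng.normal(loc=[0.001, 0.0005, 0.0],
                     scale=[0.01, 0.02, 0.015], size=(250, 3))
prices = pd.DataFrame(100 * np.cumprod(1 + returns, axis=0),
                      columns=["A", "B", "C"])
weights = optimize_weights(prices)
```

Minimizing the negated ratio is the usual way to maximize a quantity with a minimizer.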

Slide 30

Slide 30 text

Conclusions 04

Slide 31

Slide 31 text

Takeaways
- How to use some Python libraries for finance
- Problems with stock data and how to solve them
- Example uses of processed stock data

Slide 32

Slide 32 text

What’s Next?
- Computational Investing
- Machine Learning for Trading

Slide 33

Slide 33 text

THANKS
Does anyone have any questions?
CREDITS: This presentation template was created by Slidesgo, including icons by Flaticon, and infographics & images by Freepik. Please keep this slide for attribution.

Slide 34

Slide 34 text

References
- https://pandas.pydata.org/
- https://numpy.org/
- https://matplotlib.org/
- https://seaborn.pydata.org/
- https://en.wikipedia.org/wiki/Tesla,_Inc.
- https://en.wikipedia.org/wiki/Sun_Microsystems
- https://www.portfoliovisualizer.com
- CS 7646 Machine Learning for Trading from Georgia Tech