# start by loading data on the returns of the market portfolio and the risk-free rate
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

6.5. Dat APIs#

  • Pandas has an eco-system of APIs that allow you to get data from the web directly into a pandas dataframe

FRED

#!pip install fredapi
from fredapi import Fred
fred = Fred(api_key='f9207136b3333d7cf92c07273f6f5530')
data = fred.get_series('DTB3')
data.plot()
<AxesSubplot:>
../../_images/UsingAPI_2_1.png
data = fred.get_series('VIXCLS')
data.plot()
<AxesSubplot:>
../../_images/UsingAPI_3_1.png

Google Trends

#!pip install pytrends
from pytrends.request import TrendReq
pytrend = TrendReq()
pytrend.build_payload(kw_list=['wallstreetbets','GME'],timeframe='all')
Googletrends=pytrend.interest_over_time()
Googletrends.plot()
<AxesSubplot:xlabel='date'>
../../_images/UsingAPI_5_1.png

WallStreet

  • allow you to get real time data on stocks, calls, and put options

  • Also allow you to very easily get historical data

#!pip install wallstreet
from wallstreet import Stock, Call, Put
s = Stock('GME')
# this is the latest price of GME stock
s.price
25.3
# this gives you historical information
df = s.historical(days_back=1000, frequency='d')
df
Date Open High Low Close Adj Close Volume
0 2020-01-27 1.050000 1.087500 1.040000 1.070000 1.070000 8490000
1 2020-01-28 1.075000 1.075000 1.045000 1.052500 1.052500 8579600
2 2020-01-29 1.055000 1.080000 1.032500 1.032500 1.032500 6210400
3 2020-01-30 1.025000 1.030000 0.970000 0.982500 0.982500 12026000
4 2020-01-31 0.977500 0.985000 0.957500 0.960000 0.960000 11566800
... ... ... ... ... ... ... ...
687 2022-10-17 25.370001 26.410000 25.240000 25.959999 25.959999 3172200
688 2022-10-18 27.100000 27.740000 26.129999 26.670000 26.670000 4402600
689 2022-10-19 26.000000 26.180000 24.270000 24.540001 24.540001 5318400
690 2022-10-20 24.650000 25.450001 24.070000 24.410000 24.410000 5148600
691 2022-10-21 24.150000 25.330000 24.100000 25.299999 25.299999 2999649

692 rows × 7 columns

df.set_index('Date').Close.plot()
plt.title('Wild!')
Text(0.5, 1.0, 'Wild!')
../../_images/UsingAPI_10_1.png

Pandas Datareader

this gives you acess to several data servers

For example the one that we used in Chapter 5.4

We will be particularly interested in the one Ken French updates with several quant strategies

Please check out https://pandas-datareader.readthedocs.io/en/latest/remote_data.html all the possiblities

#!pip install pandas_datareader
from pandas_datareader.famafrench import get_available_datasets
len(get_available_datasets())
297

There are 297 different data sets in Ken French library!

Below I show the full list. You can simply chose one to import, below I am import the '49_Industry_Portfolios' which has the portfolio of US stocks split in 49 industries.

In fact if you print the 'DESCR' field which contains the data set description you see that there are in fact 7 data sets as described below

from datetime import datetime
import pandas_datareader.data as web

start = datetime(1926, 1, 1) # this tell which month to start
ds = web.DataReader('49_Industry_Portfolios', 'famafrench',start=start)
print(ds['DESCR'])
49 Industry Portfolios
----------------------

This file was created by CMPT_IND_RETS using the 202103 CRSP database. It contains value- and equal-weighted returns for 49 industry portfolios. The portfolios are constructed at the end of June. The annual returns are from January to December. Missing data are indicated by -99.99 or -999. Copyright 2021 Kenneth R. French

  0 : Average Value Weighted Returns -- Monthly (1137 rows x 49 cols)
  1 : Average Equal Weighted Returns -- Monthly (1137 rows x 49 cols)
  2 : Average Value Weighted Returns -- Annual (94 rows x 49 cols)
  3 : Average Equal Weighted Returns -- Annual (94 rows x 49 cols)
  4 : Number of Firms in Portfolios (1137 rows x 49 cols)
  5 : Average Firm Size (1137 rows x 49 cols)
  6 : Sum of BE / Sum of ME (95 rows x 49 cols)
  7 : Value-Weighted Average of BE/ME (95 rows x 49 cols)

If you want Value weighted monthly returns you simply select 0

ds[0].tail()
Agric Food Soda Beer Smoke Toys Fun Books Hshld Clths ... Boxes Trans Whlsl Rtail Meals Banks Insur RlEst Fin Other
Date
2020-11 15.18 7.32 9.22 10.88 8.05 21.89 11.40 22.89 5.80 14.27 ... 10.79 17.10 15.33 7.57 11.30 18.25 11.92 21.45 16.78 15.65
2020-12 1.33 1.90 6.84 3.33 8.67 15.77 9.79 10.83 0.86 4.63 ... 2.46 0.73 3.04 1.08 4.05 7.55 4.42 9.68 8.24 1.78
2021-01 2.49 -1.32 -9.60 -6.89 -2.28 6.21 -3.77 3.66 -5.78 -4.84 ... -5.13 -4.58 -1.53 0.80 -4.52 -3.40 -3.85 3.87 -1.25 -1.76
2021-02 12.40 -0.04 0.66 -3.21 5.73 -2.33 10.42 13.70 -0.12 4.07 ... 0.87 8.78 5.70 -3.17 7.97 14.29 4.11 18.83 9.31 5.98
2021-03 3.35 8.07 7.15 8.45 11.71 -8.30 -3.22 5.14 7.31 0.27 ... 6.00 7.84 8.73 5.36 3.15 4.90 9.18 -1.24 4.11 6.03

5 rows × 49 columns

If you want the book to market ratio for each industry you select 6

ds[6].tail()
Agric Food Soda Beer Smoke Toys Fun Books Hshld Clths ... Boxes Trans Whlsl Rtail Meals Banks Insur RlEst Fin Other
Date
2016 0.16 0.35 0.13 0.12 0.10 0.28 0.22 0.80 0.23 0.20 ... 0.36 0.32 0.31 0.22 0.12 0.83 0.72 0.40 0.66 0.36
2017 0.12 0.37 0.12 0.13 0.09 0.25 0.17 0.71 0.22 0.21 ... 0.29 0.25 0.31 0.21 0.16 0.63 0.60 0.33 0.49 0.34
2018 0.46 0.46 0.10 0.14 0.14 0.21 0.13 0.61 0.24 0.16 ... 0.37 0.29 0.32 0.16 0.14 0.57 0.58 0.30 0.46 0.40
2019 0.61 0.44 0.08 0.14 0.17 0.17 0.14 0.71 0.18 0.16 ... 0.36 0.28 0.33 0.16 0.11 0.58 0.55 0.42 0.53 0.34
2020 1.14 0.45 0.10 0.17 0.09 0.20 0.14 0.80 0.15 0.16 ... 0.40 0.31 0.33 0.14 0.23 0.75 0.69 0.56 0.55 0.49

5 rows × 49 columns

Below is the full list of datasets

get_available_datasets()
['F-F_Research_Data_Factors',
 'F-F_Research_Data_Factors_weekly',
 'F-F_Research_Data_Factors_daily',
 'F-F_Research_Data_5_Factors_2x3',
 'F-F_Research_Data_5_Factors_2x3_daily',
 'Portfolios_Formed_on_ME',
 'Portfolios_Formed_on_ME_Wout_Div',
 'Portfolios_Formed_on_ME_Daily',
 'Portfolios_Formed_on_BE-ME',
 'Portfolios_Formed_on_BE-ME_Wout_Div',
 'Portfolios_Formed_on_BE-ME_Daily',
 'Portfolios_Formed_on_OP',
 'Portfolios_Formed_on_OP_Wout_Div',
 'Portfolios_Formed_on_OP_Daily',
 'Portfolios_Formed_on_INV',
 'Portfolios_Formed_on_INV_Wout_Div',
 'Portfolios_Formed_on_INV_Daily',
 '6_Portfolios_2x3',
 '6_Portfolios_2x3_Wout_Div',
 '6_Portfolios_2x3_weekly',
 '6_Portfolios_2x3_daily',
 '25_Portfolios_5x5',
 '25_Portfolios_5x5_Wout_Div',
 '25_Portfolios_5x5_Daily',
 '100_Portfolios_10x10',
 '100_Portfolios_10x10_Wout_Div',
 '100_Portfolios_10x10_Daily',
 '6_Portfolios_ME_OP_2x3',
 '6_Portfolios_ME_OP_2x3_Wout_Div',
 '6_Portfolios_ME_OP_2x3_daily',
 '25_Portfolios_ME_OP_5x5',
 '25_Portfolios_ME_OP_5x5_Wout_Div',
 '25_Portfolios_ME_OP_5x5_daily',
 '100_Portfolios_ME_OP_10x10',
 '100_Portfolios_10x10_ME_OP_Wout_Div',
 '100_Portfolios_ME_OP_10x10_daily',
 '6_Portfolios_ME_INV_2x3',
 '6_Portfolios_ME_INV_2x3_Wout_Div',
 '6_Portfolios_ME_INV_2x3_daily',
 '25_Portfolios_ME_INV_5x5',
 '25_Portfolios_ME_INV_5x5_Wout_Div',
 '25_Portfolios_ME_INV_5x5_daily',
 '100_Portfolios_ME_INV_10x10',
 '100_Portfolios_10x10_ME_INV_Wout_Div',
 '100_Portfolios_ME_INV_10x10_daily',
 '25_Portfolios_BEME_OP_5x5',
 '25_Portfolios_BEME_OP_5x5_Wout_Div',
 '25_Portfolios_BEME_OP_5x5_daily',
 '25_Portfolios_BEME_INV_5x5',
 '25_Portfolios_BEME_INV_5x5_Wout_Div',
 '25_Portfolios_BEME_INV_5x5_daily',
 '25_Portfolios_OP_INV_5x5',
 '25_Portfolios_OP_INV_5x5_Wout_Div',
 '25_Portfolios_OP_INV_5x5_daily',
 '32_Portfolios_ME_BEME_OP_2x4x4',
 '32_Portfolios_ME_BEME_OP_2x4x4_Wout_Div',
 '32_Portfolios_ME_BEME_INV_2x4x4',
 '32_Portfolios_ME_BEME_INV_2x4x4_Wout_Div',
 '32_Portfolios_ME_OP_INV_2x4x4',
 '32_Portfolios_ME_OP_INV_2x4x4_Wout_Div',
 'Portfolios_Formed_on_E-P',
 'Portfolios_Formed_on_E-P_Wout_Div',
 'Portfolios_Formed_on_CF-P',
 'Portfolios_Formed_on_CF-P_Wout_Div',
 'Portfolios_Formed_on_D-P',
 'Portfolios_Formed_on_D-P_Wout_Div',
 '6_Portfolios_ME_EP_2x3',
 '6_Portfolios_ME_EP_2x3_Wout_Div',
 '6_Portfolios_ME_CFP_2x3',
 '6_Portfolios_ME_CFP_2x3_Wout_Div',
 '6_Portfolios_ME_DP_2x3',
 '6_Portfolios_ME_DP_2x3_Wout_Div',
 'F-F_Momentum_Factor',
 'F-F_Momentum_Factor_daily',
 '6_Portfolios_ME_Prior_12_2',
 '6_Portfolios_ME_Prior_12_2_Daily',
 '25_Portfolios_ME_Prior_12_2',
 '25_Portfolios_ME_Prior_12_2_Daily',
 '10_Portfolios_Prior_12_2',
 '10_Portfolios_Prior_12_2_Daily',
 'F-F_ST_Reversal_Factor',
 'F-F_ST_Reversal_Factor_daily',
 '6_Portfolios_ME_Prior_1_0',
 '6_Portfolios_ME_Prior_1_0_Daily',
 '25_Portfolios_ME_Prior_1_0',
 '25_Portfolios_ME_Prior_1_0_Daily',
 '10_Portfolios_Prior_1_0',
 '10_Portfolios_Prior_1_0_Daily',
 'F-F_LT_Reversal_Factor',
 'F-F_LT_Reversal_Factor_daily',
 '6_Portfolios_ME_Prior_60_13',
 '6_Portfolios_ME_Prior_60_13_Daily',
 '25_Portfolios_ME_Prior_60_13',
 '25_Portfolios_ME_Prior_60_13_Daily',
 '10_Portfolios_Prior_60_13',
 '10_Portfolios_Prior_60_13_Daily',
 'Portfolios_Formed_on_AC',
 '25_Portfolios_ME_AC_5x5',
 'Portfolios_Formed_on_BETA',
 '25_Portfolios_ME_BETA_5x5',
 'Portfolios_Formed_on_NI',
 '25_Portfolios_ME_NI_5x5',
 'Portfolios_Formed_on_VAR',
 '25_Portfolios_ME_VAR_5x5',
 'Portfolios_Formed_on_RESVAR',
 '25_Portfolios_ME_RESVAR_5x5',
 '5_Industry_Portfolios',
 '5_Industry_Portfolios_Wout_Div',
 '5_Industry_Portfolios_daily',
 '10_Industry_Portfolios',
 '10_Industry_Portfolios_Wout_Div',
 '10_Industry_Portfolios_daily',
 '12_Industry_Portfolios',
 '12_Industry_Portfolios_Wout_Div',
 '12_Industry_Portfolios_daily',
 '17_Industry_Portfolios',
 '17_Industry_Portfolios_Wout_Div',
 '17_Industry_Portfolios_daily',
 '30_Industry_Portfolios',
 '30_Industry_Portfolios_Wout_Div',
 '30_Industry_Portfolios_daily',
 '38_Industry_Portfolios',
 '38_Industry_Portfolios_Wout_Div',
 '38_Industry_Portfolios_daily',
 '48_Industry_Portfolios',
 '48_Industry_Portfolios_Wout_Div',
 '48_Industry_Portfolios_daily',
 '49_Industry_Portfolios',
 '49_Industry_Portfolios_Wout_Div',
 '49_Industry_Portfolios_daily',
 'ME_Breakpoints',
 'BE-ME_Breakpoints',
 'OP_Breakpoints',
 'INV_Breakpoints',
 'E-P_Breakpoints',
 'CF-P_Breakpoints',
 'D-P_Breakpoints',
 'Prior_2-12_Breakpoints',
 'Developed_3_Factors',
 'Developed_3_Factors_Daily',
 'Developed_ex_US_3_Factors',
 'Developed_ex_US_3_Factors_Daily',
 'Europe_3_Factors',
 'Europe_3_Factors_Daily',
 'Japan_3_Factors',
 'Japan_3_Factors_Daily',
 'Asia_Pacific_ex_Japan_3_Factors',
 'Asia_Pacific_ex_Japan_3_Factors_Daily',
 'North_America_3_Factors',
 'North_America_3_Factors_Daily',
 'Developed_5_Factors',
 'Developed_5_Factors_Daily',
 'Developed_ex_US_5_Factors',
 'Developed_ex_US_5_Factors_Daily',
 'Europe_5_Factors',
 'Europe_5_Factors_Daily',
 'Japan_5_Factors',
 'Japan_5_Factors_Daily',
 'Asia_Pacific_ex_Japan_5_Factors',
 'Asia_Pacific_ex_Japan_5_Factors_Daily',
 'North_America_5_Factors',
 'North_America_5_Factors_Daily',
 'Developed_Mom_Factor',
 'Developed_Mom_Factor_Daily',
 'Developed_ex_US_Mom_Factor',
 'Developed_ex_US_Mom_Factor_Daily',
 'Europe_Mom_Factor',
 'Europe_Mom_Factor_Daily',
 'Japan_Mom_Factor',
 'Japan_Mom_Factor_Daily',
 'Asia_Pacific_ex_Japan_MOM_Factor',
 'Asia_Pacific_ex_Japan_MOM_Factor_Daily',
 'North_America_Mom_Factor',
 'North_America_Mom_Factor_Daily',
 'Developed_6_Portfolios_ME_BE-ME',
 'Developed_6_Portfolios_ME_BE-ME_daily',
 'Developed_ex_US_6_Portfolios_ME_BE-ME',
 'Developed_ex_US_6_Portfolios_ME_BE-ME_daily',
 'Europe_6_Portfolios_ME_BE-ME',
 'Europe_6_Portfolios_ME_BE-ME_daily',
 'Japan_6_Portfolios_ME_BE-ME',
 'Japan_6_Portfolios_ME_BE-ME_daily',
 'Asia_Pacific_ex_Japan_6_Portfolios_ME_BE-ME',
 'Asia_Pacific_ex_Japan_6_Portfolios_ME_BE-ME_daily',
 'North_America_6_Portfolios_ME_BE-ME',
 'North_America_6_Portfolios_ME_BE-ME_daily',
 'Developed_25_Portfolios_ME_BE-ME',
 'Developed_25_Portfolios_ME_BE-ME_daily',
 'Developed_ex_US_25_Portfolios_ME_BE-ME',
 'Developed_ex_US_25_Portfolios_ME_BE-ME_daily',
 'Europe_25_Portfolios_ME_BE-ME',
 'Europe_25_Portfolios_ME_BE-ME_daily',
 'Japan_25_Portfolios_ME_BE-ME',
 'Japan_25_Portfolios_ME_BE-ME_daily',
 'Asia_Pacific_ex_Japan_25_Portfolios_ME_BE-ME',
 'Asia_Pacific_ex_Japan_25_Portfolios_ME_BE-ME_daily',
 'North_America_25_Portfolios_ME_BE-ME',
 'North_America_25_Portfolios_ME_BE-ME_daily',
 'Developed_6_Portfolios_ME_OP',
 'Developed_6_Portfolios_ME_OP_Daily',
 'Developed_ex_US_6_Portfolios_ME_OP',
 'Developed_ex_US_6_Portfolios_ME_OP_Daily',
 'Europe_6_Portfolios_ME_OP',
 'Europe_6_Portfolios_ME_OP_Daily',
 'Japan_6_Portfolios_ME_OP',
 'Japan_6_Portfolios_ME_OP_Daily',
 'Asia_Pacific_ex_Japan_6_Portfolios_ME_OP',
 'Asia_Pacific_ex_Japan_6_Portfolios_ME_OP_Daily',
 'North_America_6_Portfolios_ME_OP',
 'North_America_6_Portfolios_ME_OP_Daily',
 'Developed_25_Portfolios_ME_OP',
 'Developed_25_Portfolios_ME_OP_Daily',
 'Developed_ex_US_25_Portfolios_ME_OP',
 'Developed_ex_US_25_Portfolios_ME_OP_Daily',
 'Europe_25_Portfolios_ME_OP',
 'Europe_25_Portfolios_ME_OP_Daily',
 'Japan_25_Portfolios_ME_OP',
 'Japan_25_Portfolios_ME_OP_Daily',
 'Asia_Pacific_ex_Japan_25_Portfolios_ME_OP',
 'Asia_Pacific_ex_Japan_25_Portfolios_ME_OP_Daily',
 'North_America_25_Portfolios_ME_OP',
 'North_America_25_Portfolios_ME_OP_Daily',
 'Developed_6_Portfolios_ME_INV',
 'Developed_6_Portfolios_ME_INV_Daily',
 'Developed_ex_US_6_Portfolios_ME_INV',
 'Developed_ex_US_6_Portfolios_ME_INV_Daily',
 'Europe_6_Portfolios_ME_INV',
 'Europe_6_Portfolios_ME_INV_Daily',
 'Japan_6_Portfolios_ME_INV',
 'Japan_6_Portfolios_ME_INV_Daily',
 'Asia_Pacific_ex_Japan_6_Portfolios_ME_INV',
 'Asia_Pacific_ex_Japan_6_Portfolios_ME_INV_Daily',
 'North_America_6_Portfolios_ME_INV',
 'North_America_6_Portfolios_ME_INV_Daily',
 'Developed_25_Portfolios_ME_INV',
 'Developed_25_Portfolios_ME_INV_Daily',
 'Developed_ex_US_25_Portfolios_ME_INV',
 'Developed_ex_US_25_Portfolios_ME_INV_Daily',
 'Europe_25_Portfolios_ME_INV',
 'Europe_25_Portfolios_ME_INV_Daily',
 'Japan_25_Portfolios_ME_INV',
 'Japan_25_Portfolios_ME_INV_Daily',
 'Asia_Pacific_ex_Japan_25_Portfolios_ME_INV',
 'Asia_Pacific_ex_Japan_25_Portfolios_ME_INV_Daily',
 'North_America_25_Portfolios_ME_INV',
 'North_America_25_Portfolios_ME_INV_Daily',
 'Developed_6_Portfolios_ME_Prior_12_2',
 'Developed_6_Portfolios_ME_Prior_250_20_daily',
 'Developed_ex_US_6_Portfolios_ME_Prior_12_2',
 'Developed_ex_US_6_Portfolios_ME_Prior_250_20_daily',
 'Europe_6_Portfolios_ME_Prior_12_2',
 'Europe_6_Portfolios_ME_Prior_250_20_daily',
 'Japan_6_Portfolios_ME_Prior_12_2',
 'Japan_6_Portfolios_ME_Prior_250_20_daily',
 'Asia_Pacific_ex_Japan_6_Portfolios_ME_Prior_12_2',
 'Asia_Pacific_ex_Japan_6_Portfolios_ME_Prior_250_20_daily',
 'North_America_6_Portfolios_ME_Prior_12_2',
 'North_America_6_Portfolios_ME_Prior_250_20_daily',
 'Developed_25_Portfolios_ME_Prior_12_2',
 'Developed_25_Portfolios_ME_Prior_250_20_daily',
 'Developed_ex_US_25_Portfolios_ME_Prior_12_2',
 'Developed_ex_US_25_Portfolios_ME_Prior_250_20_daily',
 'Europe_25_Portfolios_ME_Prior_12_2',
 'Europe_25_Portfolios_ME_Prior_250_20_daily',
 'Japan_25_Portfolios_ME_Prior_12_2',
 'Japan_25_Portfolios_ME_Prior_250_20_daily',
 'Asia_Pacific_ex_Japan_25_Portfolios_ME_Prior_12_2',
 'Asia_Pacific_ex_Japan_25_Portfolios_ME_Prior_250_20_daily',
 'North_America_25_Portfolios_ME_Prior_12_2',
 'North_America_25_Portfolios_ME_Prior_250_20_daily',
 'Developed_32_Portfolios_ME_BE-ME_OP_2x4x4',
 'Developed_ex_US_32_Portfolios_ME_BE-ME_OP_2x4x4',
 'Europe_32_Portfolios_ME_BE-ME_OP_2x4x4',
 'Japan_32_Portfolios_ME_BE-ME_OP_2x4x4',
 'Asia_Pacific_ex_Japan_32_Portfolios_ME_BE-ME_OP_2x4x4',
 'North_America_32_Portfolios_ME_BE-ME_OP_2x4x4',
 'Developed_32_Portfolios_ME_BE-ME_INV(TA)_2x4x4',
 'Developed_ex_US_32_Portfolios_ME_BE-ME_INV(TA)_2x4x4',
 'Europe_32_Portfolios_ME_BE-ME_INV(TA)_2x4x4',
 'Japan_32_Portfolios_ME_BE-ME_INV(TA)_2x4x4',
 'Asia_Pacific_ex_Japan_32_Portfolios_ME_BE-ME_INV(TA)_2x4x4',
 'North_America_32_Portfolios_ME_BE-ME_INV(TA)_2x4x4',
 'Developed_32_Portfolios_ME_INV(TA)_OP_2x4x4',
 'Developed_ex_US_32_Portfolios_ME_INV(TA)_OP_2x4x4',
 'Europe_32_Portfolios_ME_INV(TA)_OP_2x4x4',
 'Japan_32_Portfolios_ME_INV(TA)_OP_2x4x4',
 'Asia_Pacific_ex_Japan_32_Portfolios_ME_INV(TA)_OP_2x4x4',
 'North_America_32_Portfolios_ME_INV(TA)_OP_2x4x4',
 'Emerging_5_Factors',
 'Emerging_MOM_Factor',
 'Emerging_Markets_6_Portfolios_ME_BE-ME',
 'Emerging_Markets_6_Portfolios_ME_OP',
 'Emerging_Markets_6_Portfolios_ME_INV',
 'Emerging_Markets_6_Portfolios_ME_Prior_12_2',
 'Emerging_Markets_4_Portfolios_BE-ME_OP',
 'Emerging_Markets_4_Portfolios_OP_INV',
 'Emerging_Markets_4_Portfolios_BE-ME_INV']

WRDS

This is harder and more sophisticated

You DONT NEED TO USE THIS IF YOU DONT WANT!

WRDS has all sorts of amazing data for finance. For equity prices and mutual funds in particular.

You can easily register for an account here https://wrds-www.wharton.upenn.edu/register/

In this class we will use it go get stock market data and also data on company earnigns and other balance sheet information

Google/Yahoo finance data are nice to play with, but they tend to have errors in it. So if you are doing serious work I will probably advise agaist using them only.

WRDS is really the gold standard, but the data publication lag is long, so you can use it to do research, but not to trade

To use it you have to load two libraries

#!pip install wrds
import wrds
import psycopg2 
Collecting wrds
  Downloading wrds-3.1.2-py3-none-any.whl (13 kB)
Requirement already satisfied: sqlalchemy in c:\users\alan.moreira\anaconda3\lib\site-packages (from wrds) (1.4.32)
Collecting psycopg2-binary
  Downloading psycopg2_binary-2.9.4-cp39-cp39-win_amd64.whl (1.2 MB)
Requirement already satisfied: mock in c:\users\alan.moreira\anaconda3\lib\site-packages (from wrds) (4.0.3)
Requirement already satisfied: pandas in c:\users\alan.moreira\anaconda3\lib\site-packages (from wrds) (1.4.2)
Requirement already satisfied: numpy in c:\users\alan.moreira\anaconda3\lib\site-packages (from wrds) (1.21.5)
Requirement already satisfied: pytz>=2020.1 in c:\users\alan.moreira\anaconda3\lib\site-packages (from pandas->wrds) (2021.3)
Requirement already satisfied: python-dateutil>=2.8.1 in c:\users\alan.moreira\anaconda3\lib\site-packages (from pandas->wrds) (2.8.2)
Requirement already satisfied: six>=1.5 in c:\users\alan.moreira\anaconda3\lib\site-packages (from python-dateutil>=2.8.1->pandas->wrds) (1.16.0)
Requirement already satisfied: greenlet!=0.4.17 in c:\users\alan.moreira\anaconda3\lib\site-packages (from sqlalchemy->wrds) (1.1.1)
Installing collected packages: psycopg2-binary, wrds
Successfully installed psycopg2-binary-2.9.4 wrds-3.1.2

You then initiate a connection when you are requested to type your pasword and username

conn=wrds.Connection()
Enter your WRDS username [Alan.Moreira]: moreira5
Enter your password: ···········
WRDS recommends setting up a .pgpass file.
Create .pgpass file now [y/n]?:  y
Created .pgpass file successfully.
Loading library list...
Done

With this connection you can use their SQL interface to get the desired data

Here I recommend most copying and pasting code across the web.

This website by Freda Drechsler provides lots of guidance and code on how to do it

https://www.fredasongdrechsler.com/intro-to-python-for-fnce/maneuvering-wrds-data

and also has lots of code that teaches you how to get both financial and accouting information from WRDS, for example, (https://www.fredasongdrechsler.com/full-python-code/dgtw)

Below we will simply write a function that dowloads dailt return data for a given firm

def get_returns(tickers,conn,startdate,enddate):
    ticker=tickers[0]
    df = conn.raw_sql("""
                          select a.permno, b.ticker, a.date,a.ret 
                          from crsp.dsf as a
                          left join crsp.msenames as b
                          on a.permno=b.permno
                          and b.namedt<=a.date
                          and a.date<=b.nameendt
                          where a.date between '"""+ startdate+"""'  and '"""+ enddate+            
                          """' and  b.ticker='"""+ticker+"'")
            
    df.set_index(['date','permno'],inplace=True)
    df=df['ret'].unstack()
    return df
df=get_returns(['TSLA'],conn,'1/1/2017','1/1/2022')
df.plot()
<AxesSubplot:xlabel='date'>
../../_images/UsingAPI_27_1.png

If I wanted other information I would have to know this information is named in the database

For example,

  • vol gets tradign volume (in shares),

  • prc gets the unadjusted price,

  • retx gets returns without dividends

  • shrout gets share outstanding

In the end of this notebook you put a list of data that this particular data set in wrds allow you easy access to.It also has a description of how missing values are handled.

To construct the market cap you need both the price and the shareoutstanding, which you can easily download with a single function call by adding to your code

def get_data(tickers,conn,startdate,enddate):
    ticker=tickers[0]
    df = conn.raw_sql("""
                          select a.permno, b.ticker, a.date,a.prc,a.shrout 
                          from crsp.dsf as a
                          left join crsp.msenames as b
                          on a.permno=b.permno
                          and b.namedt<=a.date
                          and a.date<=b.nameendt
                          where a.date between '"""+ startdate+"""'  and '"""+ enddate+            
                          """' and  b.ticker='"""+ticker+"'")
            
    df.set_index(['date','permno'],inplace=True)
 #   I commented out this step because now I would like to get more than one variable so I cannot unstack
  #  df=df[variable].unstack()
    return df
df=get_data(['TSLA'],conn,'1/1/2005','1/1/2022')
df.tail()
ticker prc shrout
date permno
2021-12-27 93436.0 TSLA 1093.939941 1004265.0
2021-12-28 93436.0 TSLA 1088.469971 1004265.0
2021-12-29 93436.0 TSLA 1086.189941 1004265.0
2021-12-30 93436.0 TSLA 1070.339966 1004265.0
2021-12-31 93436.0 TSLA 1056.780029 1033534.0
df['mcap']=df.prc.abs()*df.shrout
# the use a negative price to say that the price is not based on end of day transaction 
#but a mid price between bid and ask
df.mcap.plot()
<AxesSubplot:xlabel='date,permno'>
../../_images/UsingAPI_32_1.png

This is ugly because we have the permno, the indentity of security as an index. To fix this, lets take this out of the index

df=df.reset_index().set_index('date')
df.mcap.plot()
<AxesSubplot:xlabel='date'>
../../_images/UsingAPI_34_1.png

Variable reference for WRDS CRSP data set image.png image.png image.png