Assignment 6
17.28. Assignment 6
Instructions: This problem set should be done in a group.
Your group was assigned at orientation, and you can also find it in Blackboard.
You will use this same group to do your final project.
Answer each question in the designated space below.
After you are done, save your file and upload it to Blackboard.
Please check that you are submitting the correct file. One way to avoid mistakes is to save it with a different name.
17.29. Names of your group members
Please write names below
[Name]:
[Name]:
[Name]:
[Name]:
[Name]:
17.30. Discussion Forum Assignment
Do the Value investing assignment in the discussion forum
17.31. Exercises
1. Data Cleaning
You will work with the same dataset as in Assignment 5. The dataset is available at
url='https://github.com/amoreira2/Lectures/blob/main/assets/data/Assignment5.xlsx?raw=true'
Do the following:
Import pandas, numpy, matplotlib, and load the data set.
Import the datasets of industry returns and risk free rate.
Parse the date.
Set the index.
Drop missing observations.
Construct a dataframe with only excess returns.
Call this dataframe with the 49 excess return time series df.
Call df.head() to check if everything works.
Hint: You did it in assignment 5, simply copy and paste your code.
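The cleaning steps listed above can be sketched on a small made-up table. This is only an illustration under stated assumptions: the column names (Date, RF, Food, Hlth), the yyyymm date format, and the single-table layout are hypothetical and may not match the spreadsheet.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Toy stand-in for the spreadsheet; the column names and layout are hypothetical.
raw = pd.DataFrame({
    "Date": [202001, 202002, 202003, 202004],
    "RF": [0.10, 0.10, 0.20, 0.10],
    "Food": [1.0, -2.0, None, 0.5],
    "Hlth": [0.5, 1.5, 2.0, -1.0],
})

# Parse yyyymm integers into month-end timestamps and use them as the index.
raw["Date"] = pd.to_datetime(raw["Date"].astype(str), format="%Y%m") + MonthEnd(0)
raw = raw.set_index("Date").dropna()  # drops the month with a missing value

# Subtract the risk-free rate from every industry column to get excess returns.
toy_df = raw.drop(columns="RF").sub(raw["RF"], axis=0)
print(toy_df)
```

Subtracting with axis=0 aligns the risk-free series with the rows (dates) rather than the columns.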
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.tseries.offsets import MonthEnd
# your code below
df.head()
2. Expected excess return estimation
Compute the sample mean as the estimators for the expected excess returns of the 49 assets.
Call this ERe.
# your code below
ERe.head()
3. Expected excess return uncertainty
We will now construct an estimator for the amount of uncertainty in our sample mean estimator. If we assume that each individual asset's returns are uncorrelated over time (not a terrible assumption), then the variance of the sample mean is
\(\sigma^2(\bar{r})=\frac{\sigma^2(r)}{T}\)
So all you need is the sample size (T) and the variance of each asset to obtain the variance of our estimator.
Please use this formula to compute the STANDARD DEVIATION of the sample average estimator for each of the 49 assets. Call this ERe_se.
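As a rough illustration of the calculation described above, the sketch below uses synthetic returns for three made-up assets. The names toy, toy_mean, and toy_se are hypothetical and are not the assignment's variables.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic monthly excess returns for three made-up assets (not the assignment data).
toy = pd.DataFrame(rng.normal(0.005, 0.05, size=(240, 3)), columns=["A", "B", "C"])

T = toy.shape[0]                 # sample size
toy_mean = toy.mean()            # sample mean of each asset
toy_se = np.sqrt(toy.var() / T)  # square root of variance / T: std. dev. of the sample mean
print(toy_se)
```

The result is a Series with one standard error per asset, equal to each asset's standard deviation divided by the square root of T.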
# your code below
ERe_se.head()
4. Constructing a confidence interval for the expected excess return, part 1
We now want to construct the 95% confidence interval for our estimator. The interval is such that it contains the true mean 95% of the time.
The way to do this is to use the normal distribution CDF to figure out the threshold that leaves only 2.5% probability in each tail.
Why 2.5% and not 5%? Because the distribution is symmetric, there is 2.5% probability in the left tail and 2.5% in the right tail, so overall there is only 5% probability that the expected return is outside the interval. Thus there is 95% probability that it is in the interval.
In this exercise, you will find the threshold by doing the following:
import the stats library from the scipy package with from scipy import stats
get the standard normal distribution with sn=stats.norm(0,1), where 0 is the mean and 1 is the standard deviation
get the threshold by applying the inverse survival function (the inverse of 1 minus the cumulative distribution function) to the appropriate prob_value to create a 95% CI (see discussion above): threshold=sn.isf(prob_value)
make sure that this threshold is positive (if you took it from the left tail, you will have to take the absolute value, or just take it from the right tail).
make sure you did things correctly by calling print(threshold).
Hint:
You can always check in the normal table that you did things correctly https://en.wikipedia.org/wiki/Normal_distribution.
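To see how the two tails relate, here is a small sketch for a 90% interval, a level chosen only for illustration and not the one the exercise asks for. It compares scipy's inverse survival function isf with the inverse CDF ppf on the standard normal.

```python
from scipy import stats

sn = stats.norm(0, 1)
alpha = 0.10                 # a 90% interval, used here only as an illustration
right = sn.isf(alpha / 2)    # inverse survival function: P(Z > right) = alpha / 2
left = sn.ppf(alpha / 2)     # inverse CDF: P(Z < left) = alpha / 2
print(right, left)
```

By symmetry the two values differ only in sign, which is why taking the absolute value of the left-tail number gives the same threshold.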
# your code below
print(threshold)
5. Constructing a confidence interval for the expected excess return, part 2
Armed with this threshold, you can construct the interval as \([\bar{r}-threshold\times\sigma(\bar{r}),\ \bar{r}+threshold\times\sigma(\bar{r})]\).
Do the following:
create an empty dataframe which has the names of industries as index and ‘lower’ and ‘upper’ as column names. Name it ERe_ci.
construct the lower bound of the interval, \(\bar{r}-threshold\times\sigma(\bar{r})\), and store it in the column ‘lower’
compute the upper bound symmetrically, and store it in the column ‘upper’
call ERe_ci.head()
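The steps above might look roughly like the following on made-up numbers. The means, standard errors, and the threshold z = 2.0 are invented for illustration, and toy_ci is a hypothetical name.

```python
import pandas as pd

# Made-up means, standard errors, and threshold; none of these come from the assignment data.
toy_mean = pd.Series({"A": 0.8, "B": 0.5, "C": 1.1})
toy_se = pd.Series({"A": 0.2, "B": 0.3, "C": 0.25})
z = 2.0

# Empty frame indexed by asset name, then fill the two bounds column by column.
toy_ci = pd.DataFrame(index=toy_mean.index, columns=["lower", "upper"], dtype=float)
toy_ci["lower"] = toy_mean - z * toy_se
toy_ci["upper"] = toy_mean + z * toy_se
print(toy_ci)
```

Because the Series and the frame share the same index, pandas aligns the bounds to the right asset automatically.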
# your code below
ERe_ci.head()
6. Compute the tangency portfolio weights for a portfolio with annualized volatility of 10%
Store these in a dataframe whose rows are the names of the assets and the first column has the label ‘mve_data’.
Name this dataframe Weights.
print(Weights)
TIP: You did this in Assignment 5.
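One common way to build such weights is sketched below on synthetic data. It assumes monthly returns in decimal units and annualizes volatility by multiplying by the square root of 12; if your returns are in percent, the target would be 10 rather than 0.10. The names toy and toy_weights are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Synthetic monthly excess returns for four made-up assets.
toy = pd.DataFrame(rng.normal(0.006, 0.05, size=(360, 4)), columns=list("ABCD"))

mu = toy.mean()
cov = toy.cov()

# Unscaled tangency direction: inverse covariance matrix times expected excess returns.
w = np.linalg.solve(cov.values, mu.values)
# Rescale so that monthly volatility sqrt(w' Cov w), times sqrt(12), equals 10%.
monthly_vol = np.sqrt(w @ cov.values @ w)
w = w * (0.10 / np.sqrt(12)) / monthly_vol

toy_weights = pd.DataFrame({"mve_data": w}, index=toy.columns)
print(toy_weights)
```

Using np.linalg.solve avoids forming the inverse explicitly, which tends to be more stable when the covariance matrix is large.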
# your code below
print(Weights.head())
7. Sensitivity to uncertainty of the Tangent portfolio calculation
Now we will compute the tangency portfolio but using a slightly different estimate for the mean.
Do the following:
instead of using the sample mean for each asset, first we will pick one asset, Hlth.
change its mean to the lower bound of its CI from Exercise 5 and then recalculate the tangency portfolio weights.
store this in the dataframe Weights with the column name mve_Hlth-1.95.
create another column with weights computed from the perturbation in which its mean is changed to the upper bound of the CI, and label this column mve_Hlth+1.95.
do a bar plot of these three sets of weights using Weights.plot.bar().
Discuss what you notice in the bar plot:
How much do the weights change?
Which assets are impacted? Why?
Hint:
you might want to create a copy of your ERe estimator before you do the perturbation.
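The perturbation idea, including the copy mentioned in the hint, can be sketched on synthetic data as follows. The helper mve_weights, the asset names, the column label mve_A_low, and the 1.96 multiplier are all assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

def mve_weights(mu, cov, target_vol=0.10 / np.sqrt(12)):
    """Tangency weights scaled to a monthly volatility target (hypothetical helper)."""
    w = np.linalg.solve(cov.values, mu.values)
    return pd.Series(w * target_vol / np.sqrt(w @ cov.values @ w), index=mu.index)

rng = np.random.default_rng(2)
toy = pd.DataFrame(rng.normal(0.006, 0.05, size=(360, 4)), columns=list("ABCD"))
mu, cov = toy.mean(), toy.cov()
se = np.sqrt(toy.var() / len(toy))

W = pd.DataFrame({"mve_data": mve_weights(mu, cov)})
mu_low = mu.copy()                        # copy so the original estimate stays untouched
mu_low["A"] = mu["A"] - 1.96 * se["A"]    # move one asset's mean to a lower bound
W["mve_A_low"] = mve_weights(mu_low, cov)
print(W)
```

Only one entry of the mean vector changes, yet every weight can move because the inverse covariance matrix links all assets.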
# your code below
# your discussion below
# 1. How much do the weights change?
# 2. Which assets are impacted? Why?
8. Performance impact of estimation uncertainty
Your Weights dataframe has 3 different weight schemes.
Do the following:
compute the in-sample Sharpe Ratio for these 3 different weight schemes.
discuss the results that you obtained
Hint:
You should use the real data (e.g. df, ERe, CovRe) to compute the Sharpe ratio.
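A minimal sketch of an in-sample Sharpe ratio calculation, on synthetic data with arbitrary weights, is shown below. It computes the monthly ratio two ways, from the moments and from the portfolio return series; all names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
toy = pd.DataFrame(rng.normal(0.006, 0.05, size=(360, 3)), columns=list("ABC"))
mu, cov = toy.mean(), toy.cov()

w = pd.Series([0.5, 0.3, 0.2], index=toy.columns)   # arbitrary illustrative weights

# Monthly Sharpe ratio from the moments: w'mu / sqrt(w' Cov w).
sr_moments = (w @ mu) / np.sqrt(w @ cov @ w)
# The same number computed from the portfolio return series itself.
port = toy @ w
sr_series = port.mean() / port.std()
print(sr_moments, sr_series)
```

The two numbers agree because the sample variance of the portfolio return equals w' Cov w when both use the same degrees-of-freedom correction.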
# your code below
# your discussion below
9. Reproduce the analysis of Exercises 7-8 for all assets
Do the following:
use a for loop to loop through the 49 portfolios and create the “perturbed” weights and the Sharpe Ratio of the perturbed weights.
record for each asset the average drop in the Sharpe Ratio associated with the perturbation in the tangency portfolio weights.
store the results in a dataframe named dSR (difference in SR):
\(dSR[asset]=\frac{1}{2}\frac{SR(asset+1.95)+SR(asset-1.95)}{SR(data)}\)
where SR(asset+1.95) and SR(asset-1.95) are the Sharpe ratios obtained when you perturb the expected excess return of that asset to the upper and lower bound of the CI.
dSR should be a dataframe with industry names as index and a column, called SR_change, containing the results of the calculation from the expression above.
do a bar plot of this Sharpe ratio change.
Discuss the bar plot:
What do you think is the key takeaway from the analysis above?
Hint: Note that all you need to do here is to get the code you developed above and adapt it to work with a for loop.
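The loop structure could be sketched as follows on synthetic data. The helper sharpe_of_mve, the 1.96 multiplier, and the name toy_dSR are assumptions for illustration; the Sharpe ratio is always evaluated with the unperturbed sample moments.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
toy = pd.DataFrame(rng.normal(0.006, 0.05, size=(360, 4)), columns=list("ABCD"))
mu, cov = toy.mean(), toy.cov()
se = np.sqrt(toy.var() / len(toy))

def sharpe_of_mve(mu_used):
    """Sharpe ratio, under the sample moments, of tangency weights built from mu_used."""
    w = np.linalg.solve(cov.values, mu_used.values)
    return (w @ mu.values) / np.sqrt(w @ cov.values @ w)

sr_data = sharpe_of_mve(mu)
rows = {}
for asset in toy.columns:
    srs = []
    for sign in (+1, -1):
        mu_p = mu.copy()                                   # perturb one asset at a time
        mu_p[asset] = mu[asset] + sign * 1.96 * se[asset]
        srs.append(sharpe_of_mve(mu_p))
    rows[asset] = 0.5 * sum(srs) / sr_data                 # average SR relative to SR(data)
toy_dSR = pd.DataFrame({"SR_change": pd.Series(rows)})
print(toy_dSR)
```

Rescaling the weights to a volatility target is skipped here because a positive rescaling leaves the Sharpe ratio unchanged.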
# your code below
# your discussion below
10. Monte Carlo, part 1
So far our focus is on the estimation uncertainty of risk premiums. Covariance matrices also need to be estimated.
You will now implement a Monte-Carlo method to evaluate the overall uncertainty in the construction of the tangency portfolio.
You already have the sample estimates for the vector of expected excess returns ERe and the variance-covariance matrix CovRe (you used those in Exercise 6).
Now you use the function np.random.multivariate_normal to simulate draws from a multivariate normal distribution with mean vector equal to ERe and covariance matrix equal to CovRe.
Do the following:
write the code that draws ONE realization of returns for this set of 49 assets.
Hint:
you should get a vector 49 by 1 that changes every time you run the cell.
type np.random.multivariate_normal? to see how this function works
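For a sense of what a single draw looks like, here is a sketch with a made-up mean vector and covariance matrix for three assets. Without a size argument, the function returns a one-dimensional array with one entry per asset.

```python
import numpy as np

# Made-up mean vector and covariance matrix for three assets (illustration only).
mean_vec = np.array([0.01, 0.005, 0.008])
cov_mat = np.array([[0.0025, 0.0010, 0.0005],
                    [0.0010, 0.0036, 0.0008],
                    [0.0005, 0.0008, 0.0016]])

draw = np.random.multivariate_normal(mean_vec, cov_mat)  # no size argument: one draw
print(draw.shape)
```

No seed is set, so the values change on every run.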
# your code below
11. Monte Carlo, part 2
Do the following:
now set the parameter size in the multivariate_normal function to draw T realizations of the 49 assets, where you set T to the number of months you have in the data set.
print the shape of your draw. This should return a T by N matrix of returns, something with exactly the same shape as our data set.
Hint: every time you run the cell again you get a different realization.
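The effect of the size parameter can be checked on a two-asset toy example. The value T_toy = 120 is just a stand-in for the number of months in the data set.

```python
import numpy as np

mean_vec = np.array([0.01, 0.005])
cov_mat = np.array([[0.0025, 0.0010],
                    [0.0010, 0.0036]])
T_toy = 120  # stand-in for the number of months in the data set

sim = np.random.multivariate_normal(mean_vec, cov_mat, size=T_toy)
print(sim.shape)  # rows are months, columns are assets
```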
# your code below
12. Monte Carlo, part 3
Do the following:
copy the code above, so you have a simulated sample of monthly industry returns.
use the simulated return data and the weights in the mve_data column of the dataframe Weights to construct a time series of portfolio excess returns.
compute and print its Sharpe Ratio.
Hint:
Every time you run this cell you should get a different Sharpe Ratio. This variation reflects the amount of overall uncertainty built in our investment strategy.
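A rough sketch of this step on made-up inputs is given below. The weight vector w is an arbitrary stand-in for the mve_data column, and the diagonal covariance matrix is chosen only for simplicity.

```python
import numpy as np
import pandas as pd

names = list("ABC")
mean_vec = pd.Series([0.01, 0.005, 0.008], index=names)
cov_mat = pd.DataFrame(np.diag([0.0025, 0.0036, 0.0016]), index=names, columns=names)
w = pd.Series([0.4, 0.3, 0.3], index=names)  # stand-in for the mve_data column

# One simulated sample of 240 months, wrapped in a DataFrame with asset names.
sim = pd.DataFrame(
    np.random.multivariate_normal(mean_vec.values, cov_mat.values, size=240),
    columns=names,
)
port = sim @ w                   # time series of simulated portfolio excess returns
sr = port.mean() / port.std()    # monthly Sharpe ratio of this one simulated sample
print(sr)
```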
# your code below
13. Monte Carlo, part 4
Now copy the code of the question above and write a for loop around it.
Do the following:
loop through this code 1000 times and each time record the resulting Sharpe Ratio.
save this in a dataframe called MC.
create a histogram of these Sharpe Ratios with 50 bins using the method .hist
Discuss the plot:
what do you conclude?
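The loop might be organized along these lines. The sketch uses made-up moments and weights, only 200 repetitions to keep it quick, and a fixed seed purely so the illustration is reproducible; toy_MC is a hypothetical name.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed only so the illustration is reproducible
mean_vec = np.array([0.01, 0.005, 0.008])
cov_mat = np.diag([0.0025, 0.0036, 0.0016])
w = np.array([0.4, 0.3, 0.3])   # arbitrary illustrative weights

n_sims, T_toy = 200, 240
srs = []
for _ in range(n_sims):
    sim = np.random.multivariate_normal(mean_vec, cov_mat, size=T_toy)
    port = sim @ w
    srs.append(port.mean() / port.std(ddof=1))   # Sharpe ratio of this simulated sample
toy_MC = pd.DataFrame({"SR": srs})
print(toy_MC["SR"].describe())
```

With matplotlib installed, calling toy_MC.hist(bins=50) would then draw the histogram of the simulated Sharpe ratios.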
# your code below
# your discussion below
14. Bootstrap
The Monte-Carlo approach assumes that the distribution of returns is normal, which is a good approximation but not literally true. We can use another approach, called the bootstrap, in which we sample from the actual data instead of sampling from the normal distribution.
Basically, the bootstrap approach randomly draws one observation (one month of 49 industry returns in our case) from the real dataset, treats it as a new observation in the simulated sample, and repeats this process until the simulated sample has the needed sample size. Observations are drawn with replacement, which means we draw from the whole real dataset every time, so it is very likely that some data points are drawn more than once.
Do the following:
start with your code from question 13,
instead of calling np.random.multivariate_normal to draw a sample, we will use the method sample
refer to https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sample.html
set the parameter ‘frac’ to 1 so it draws a sample of exactly the same size as our original sample
set the parameter ‘replace’ to ‘True’ so it samples with replacement (otherwise you will get exactly the same realizations, only in different order).
now you can simply plug this realization into your code from question 13.
save the results in a dataframe called Boot.
create a histogram of these Sharpe Ratios with 50 bins using the method .hist
Compare the results in 13 and 14, and explain:
does the key takeaway change?
does the distribution of the SR of our strategy change?
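The resampling step can be sketched on synthetic data as follows. The data, weights, and the name toy_Boot are made up, the loop runs only 200 times, and random_state is set only to make the illustration reproducible.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
toy = pd.DataFrame(rng.normal(0.006, 0.05, size=(240, 3)), columns=list("ABC"))
w = pd.Series([0.4, 0.3, 0.3], index=toy.columns)   # arbitrary illustrative weights

srs = []
for i in range(200):
    # Resample whole months (rows) with replacement, keeping the original sample size.
    boot_sample = toy.sample(frac=1, replace=True, random_state=i)
    port = boot_sample @ w
    srs.append(port.mean() / port.std())
toy_Boot = pd.DataFrame({"SR": srs})
print(toy_Boot["SR"].describe())
```

Sampling entire rows keeps each month's cross-section of returns together, so the correlation across assets in the data is preserved in every bootstrap sample.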
# your code below
# your discussion below
15. Please explain why an investor should care about these results.
# your answer below