Exercises#

Scientific/NumPy#

Exercises#


Exercise 1

Try indexing into another element of your choice from the 3-dimensional array.

Building an understanding of indexing means working through this type of operation several times – without skipping steps!
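
If it helps, here is a hypothetical 3-dimensional array to practice on (an assumption; the array in the lecture may differ):

import numpy as np

x_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]])  # assumed

# indices run from the outer-most dimension to the inner-most:
# this selects the second outer block, its second row, and that row's second element
print(x_3d[1, 1, 1])  # 50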

(back to text)

Exercise 2

Look at the 2-dimensional array x_2d.

Does the inner-most index correspond to rows or columns? What does the outer-most index correspond to?

Write your thoughts.
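
A small experiment like the following (with a hypothetical array standing in for the lecture's x_2d) can guide your answer:

import numpy as np

x_2d = np.array([[1, 2, 3], [4, 5, 6]])  # a stand-in for the lecture's array

print(x_2d[0])     # what does selecting only on the outer index return?
print(x_2d[0, 2])  # and when you fix both indices?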

(back to text)

Exercise 3

What would you do to extract the array [[5, 6], [50, 60]]?
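
If the lecture's x_3d is the nested array suggested by these values (an assumption), one answer combines a full slice with integer and range indexing:

import numpy as np

x_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]])  # assumed

print(x_3d[:, 1, 1:])  # [[ 5  6]
                       #  [50 60]]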

(back to text)

Exercise 4

Do you recall what multiplication by an integer did for lists?

How does this differ?
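
If you don't recall, a quick side-by-side comparison makes the difference clear:

import numpy as np

print([1, 2, 3] * 2)            # [1, 2, 3, 1, 2, 3] -- lists are repeated
print(np.array([1, 2, 3]) * 2)  # [2 4 6] -- arrays multiply elementwise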

(back to text)

Exercise 5

Let’s revisit a bond pricing example we saw in Control flow.

Recall that the equation for pricing a bond with coupon payment \( C \), face value \( M \), yield to maturity \( i \), and periods to maturity \( N \) is

\[
\begin{aligned}
P &= \left(\sum_{n=1}^N \frac{C}{(1+i)^n}\right) + \frac{M}{(1+i)^N} \\
&= C \left(\frac{1 - (1+i)^{-N}}{i} \right) + M(1+i)^{-N}
\end{aligned}
\]

In the code cell below, we have defined variables for i, M and C.

You have two tasks:

  1. Define a NumPy array N that contains all maturities between 1 and 10 (hint: look at the np.arange function).

  2. Using the equation above, determine the bond prices of all maturity levels in your array.

i = 0.03
M = 100
C = 5

# Define array here

# price bonds here
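
One possible solution sketch, assuming the cell above has been run so that i, M, and C are defined:

import numpy as np

N = np.arange(1, 11)  # all maturities between 1 and 10

# vectorized version of the closed-form bond price above
prices = C * (1 - (1 + i)**(-N)) / i + M * (1 + i)**(-N)
print(prices)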

Exercise 1#

Consider the polynomial expression

\[ p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_N x^N = \sum_{n=0}^N a_n x^n \tag{1} \]

Earlier, you wrote a simple function p(x, coeff) to evaluate (1) without considering efficiency.

Now write a new function that does the same job, but uses NumPy arrays and array operations for its computations, rather than any form of Python loop.

(Such functionality is already implemented as np.poly1d, but for the sake of the exercise don’t use this class)

  • Hint: Use np.cumprod()
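
Following the hint, one loop-free sketch builds the powers of x with a cumulative product and then takes an inner product with the coefficients:

import numpy as np

def p(x, coeff):
    # X = [1, x, x, ..., x], so np.cumprod(X) = [1, x, x**2, ..., x**N]
    X = np.empty(len(coeff))
    X[0] = 1
    X[1:] = x
    return coeff @ np.cumprod(X)

print(p(5, np.array([2, 1])))  # 2 + 1*5 = 7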

Exercise 2#

Let q be a NumPy array of length n with q.sum() == 1.

Suppose that q represents a probability mass function, so that q[i] is the probability of outcome i.

Write a function that takes q as an argument and returns one draw of a discrete random variable x such that x = i with probability q[i].

(Hint: np.cumsum and np.searchsorted are useful here.)
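
A minimal inverse-transform sketch (assuming q is a valid probability vector):

import numpy as np

def sample(q):
    # draw U ~ Uniform(0, 1); the first index whose cumulative
    # probability reaches U is returned with probability q[i]
    U = np.random.uniform()
    return np.searchsorted(np.cumsum(q), U)

print(sample(np.array([0.25, 0.75])))  # returns 0 about 25% of the time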

Plotting#

Exercise 1#

Plot the function

\[ f(x) = \cos(\pi \theta x) \exp(-x) \]

over the interval \( [0, 5] \) for each \( \theta \) in np.linspace(0, 2, 10).

Place all the curves in the same figure.

The output should look like this

https://python-programming.quantecon.org/_static/lecture_specific/matplotlib/matplotlib_ex1.png
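
A minimal sketch of one way to produce such a figure:

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
x = np.linspace(0, 5, 200)
for theta in np.linspace(0, 2, 10):
    ax.plot(x, np.cos(np.pi * theta * x) * np.exp(-x))
plt.show()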

Linear algebra#

Exercises#

Exercise 1

Alice is a stockbroker who owns two types of assets: A and B. She owns 100 units of asset A and 50 units of asset B. The current interest rate is 5%.

Each A asset has a remaining duration of 6 years and pays \$1,500 each year, while each B asset has a remaining duration of 4 years and pays \$500 each year.

Alice would like to retire if she can sell her assets for more than \$500,000.

Use vector addition, scalar multiplication, and dot products to determine whether she can retire.
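
A sketch of the computation, discounting each year's payment at the 5% interest rate:

import numpy as np

i = 0.05
disc_A = (1 + i) ** -np.arange(1, 7)  # discount factors for 6 annual payments
disc_B = (1 + i) ** -np.arange(1, 5)  # discount factors for 4 annual payments

pv_A = (1500 * np.ones(6)) @ disc_A   # present value of one unit of asset A
pv_B = (500 * np.ones(4)) @ disc_B    # present value of one unit of asset B

portfolio_value = 100 * pv_A + 50 * pv_B
print(portfolio_value, portfolio_value > 500_000)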

(back to text)

Exercise 2

Which of the following operations will work and which will create errors because of size issues?

Test your intuition in the code cell below.

x1 @ x2
x2 @ x1
x2 @ x3
x3 @ x2
x1 @ x3
x4 @ y1
x4 @ y2
y1 @ x4
y2 @ x4
# testing area

(back to text)

Exercise 3

Compare the distribution above to the final values of a long simulation.

If you multiply the distribution by 1,000,000 (the number of workers), do you get (roughly) the same number as the simulation?

# your code here

(back to text)

Randomness#

Exercises#


Exercise 1

Wikipedia and other credible statistics sources tell us that the mean and variance of the Uniform(0, 1) distribution are 1/2 and 1/12, respectively.

How could we check whether the numpy random numbers approximate these values?
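
One simple check (a sketch): generate many draws and compare the sample moments with the theoretical values.

import numpy as np

draws = np.random.uniform(0, 1, 1_000_000)
print(draws.mean())  # should be close to 1/2
print(draws.var())   # should be close to 1/12 ≈ 0.0833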

(back to text)


Exercise 3

Assume you have been given the opportunity to choose one of three financial assets:

You will be given the asset for free, allowed to hold it indefinitely, and permitted to keep all payoffs.

Also assume the assets’ payoffs are distributed as follows:

  1. Normal with \( \mu = 10, \sigma = 5 \)

  2. Gamma with \( k = 5.3, \theta = 2 \)

  3. Gamma with \( k = 5, \theta = 2 \)

Use scipy.stats to answer the following questions:

  • Which asset has the highest average returns?

  • Which asset has the highest median returns?

  • Which asset has the lowest coefficient of variation (standard deviation divided by mean)?

  • Which asset would you choose? Why? (Hint: There is not a single right answer here. Be creative and express your preferences)

# your code here
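
A sketch of the comparison using frozen scipy.stats distributions:

from scipy import stats

assets = {
    "Normal(10, 5)": stats.norm(loc=10, scale=5),
    "Gamma(5.3, 2)": stats.gamma(5.3, scale=2),
    "Gamma(5, 2)": stats.gamma(5, scale=2),
}

for name, dist in assets.items():
    mean = dist.mean()
    print(name, "mean:", mean, "median:", dist.median(), "CV:", dist.std() / mean)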

(back to text)

Pandas#

Intro#

Exercises#

Exercise 1

For each of the following tasks, we recommend reading the documentation for help.

  • Display only the first 2 elements of the Series using the .head method.

  • Using the plot method, make a bar plot.

  • Use .loc to select the lowest/highest unemployment rate shown in the Series.

  • Run the code unemp.dtype below. What does it give you? Where do you think it comes from?
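
A sketch of the relevant calls, assuming unemp is the unemployment Series from the lecture:

unemp.head(2)              # first two elements
unemp.plot(kind="bar")     # bar plot of the Series
unemp.loc[unemp.idxmin()]  # lowest rate (one label-based approach)
unemp.loc[unemp.idxmax()]  # highest rate
unemp.dtype                # the dtype of the values stored in the Series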

(back to text)

Exercise 2

For each of the following, we recommend reading the documentation for help.

  • Use introspection (or google-fu) to find a way to obtain a list with all of the column names in unemp_region.

  • Using the plot method, make a bar plot. What does it look like now?

  • Use .loc to select the unemployment data for the NorthEast and West for the years 1995, 2005, 2011, and 2015.

  • Run the code unemp_region.dtypes below. What does it give you? How does this compare with unemp.dtype?

(back to text)

Basics#

Exercises#

Exercise 1

Looking at the displayed DataFrame above, can you identify the index? The columns?

You can use the cell below to verify your visual intuition.

# your code here

(back to text)

Exercise 2

Do the following exercises in separate code cells below:

  • At each date, what is the minimum unemployment rate across all states in our sample?

  • What was the median unemployment rate in each state?

  • What was the maximum unemployment rate across the states in our sample? What state did it happen in? In what month/year was this achieved?

    • Hint 1: What Python type (not dtype) is returned by the aggregation?

    • Hint 2: Read documentation for the method idxmax

  • Classify each state as high or low volatility based on whether the variance of their unemployment is above or below 4.

# min unemployment rate by state
# median unemployment rate by state
# max unemployment rate across all states and Year
# low or high volatility
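
A sketch of the aggregations, assuming unemp has one row per date and one column per state:

unemp.min(axis=1)     # minimum rate across states at each date
unemp.median()        # median rate over time for each state
unemp.max().max()     # overall maximum rate
unemp.max().idxmax()  # the state in which the maximum occurred
unemp.idxmax()        # per-state date of the maximum; look up the state found above
unemp.var() > 4       # True marks high-volatility states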

(back to text)

Exercise 3

Imagine that we want to determine whether unemployment was high (> 6.5), medium (4.5 < x <= 6.5), or low (<= 4.5) for each state and each month.

  1. Write a Python function that takes a single number as an input and outputs a single string noting whether that number is high, medium, or low.

  2. Pass your function to applymap (quiz: why applymap and not agg or apply?) and save the result in a new DataFrame called unemp_bins.

  3. (Challenging) This exercise has multiple parts:

    1. Use another transform on unemp_bins to count how many times each state had each of the three classifications.
      - Hint 1: Will this value counting function be a Series or scalar transform?
      - Hint 2: Try googling “pandas count unique value” or something similar to find the right transform.

    2. Construct a horizontal bar chart of the number of occurrences of each level with one bar per state and classification (21 total bars).

  4. (Challenging) Repeat the previous step, but count how many states had each classification in each month. Which month had the most states with high unemployment? What about medium and low?

# Part 1: Write a Python function to classify unemployment levels.

# Part 2: Pass your function from Part 1 to applymap.
unemp_bins = unemp.applymap  # replace this comment with your code!

# Part 3: Count the number of times each state had each classification,
# then make a horizontal bar chart.

# Part 4: Apply the same transform from Part 3, but to each date instead of to each state.
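
A sketch of Parts 1 through 3, assuming unemp is the state-by-month DataFrame from the lecture:

import pandas as pd

def classify(x):
    # Part 1: classify a single unemployment rate
    if x > 6.5:
        return "high"
    elif x > 4.5:
        return "medium"
    else:
        return "low"

unemp_bins = unemp.applymap(classify)              # Part 2
counts = unemp_bins.apply(pd.Series.value_counts)  # Part 3: counts per state
counts.T.plot(kind="barh")                         # one bar per state and classification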

(back to text)

Exercise 4

  • For a single state of your choice, determine what the mean unemployment is during “Low”, “Medium”, and “High” unemployment times (recall your unemp_bins DataFrame from the exercise above).

    • Think about how you would do this for all the states in our sample and write your thoughts… We will soon learn tools that will greatly simplify operations like this that operate on distinct groups of data at a time.

  • Which state in our sample performs best during “bad times”? To determine this, compute the mean unemployment for each state using only months in which the mean unemployment rate across our sample is greater than 7.

(back to text)

The Index#

Exercises#

Exercise 1

What happens when you apply the mean method to im_ex_tiny?

In particular, what happens to columns that have missing data? (Hint: looking at the output of the sum method might also help.)

(back to text)

Exercise 2

For each of the examples below do the following:

  • Determine which of the rules above applies.

  • Identify the type of the returned value.

  • Explain why the slicing operation returned the data it did.

Write your answers.

wdi.loc[["United States", "Canada"]]
GovExpend Consumption Exports Imports GDP
country year
Canada 2017 0.372665 1.095475 0.582831 0.600031 1.868164
2016 0.364899 1.058426 0.576394 0.575775 1.814016
2015 0.358303 1.035208 0.568859 0.575793 1.794270
2014 0.353485 1.011988 0.550323 0.572344 1.782252
2013 0.351541 0.986400 0.518040 0.558636 1.732714
2012 0.354342 0.961226 0.505969 0.547756 1.693428
2011 0.351887 0.943145 0.492349 0.528227 1.664240
2010 0.347332 0.921952 0.469949 0.500341 1.613543
2009 0.339686 0.890078 0.440692 0.439796 1.565291
2008 0.330766 0.889602 0.506350 0.502281 1.612862
2007 0.318777 0.864012 0.530453 0.498002 1.596876
2006 0.311382 0.827643 0.524461 0.470931 1.564608
2005 0.303043 0.794390 0.519950 0.447222 1.524608
2004 0.299854 0.764357 0.508657 0.416754 1.477317
2003 0.294335 0.741796 0.481993 0.384199 1.433089
2002 0.286094 0.721974 0.490465 0.368615 1.407725
2001 0.279767 0.694230 0.484696 0.362023 1.366590
2000 0.270553 0.677713 0.499526 0.380823 1.342805
United States 2017 2.405743 12.019266 2.287071 3.069954 17.348627
2016 2.407981 11.722133 2.219937 2.936004 16.972348
2015 2.373130 11.409800 2.222228 2.881337 16.710459
2014 2.334071 11.000619 2.209555 2.732228 16.242526
2013 2.353381 10.687214 2.118639 2.600198 15.853796
2012 2.398873 10.534042 2.045509 2.560677 15.567038
2011 2.434378 10.378060 1.978083 2.493194 15.224555
2010 2.510143 10.185836 1.846280 2.360183 14.992053
2009 2.507390 10.010687 1.646432 2.086299 14.617299
2008 2.407771 10.137847 1.797347 2.400349 14.997756
2007 2.351987 10.159387 1.701096 2.455016 15.018268
2006 2.314957 9.938503 1.564920 2.395189 14.741688
2005 2.287022 9.643098 1.431205 2.246246 14.332500
2004 2.267999 9.311431 1.335978 2.108585 13.846058
2003 2.233519 8.974708 1.218199 1.892825 13.339312
2002 2.193188 8.698306 1.192180 1.804105 12.968263
2001 2.112038 8.480461 1.213253 1.740797 12.746262
2000 2.040500 8.272097 1.287739 1.790995 12.620268
wdi.loc[(["United States", "Canada"], [2010, 2011, 2012]), :]
GovExpend Consumption Exports Imports GDP
country year
Canada 2012 0.354342 0.961226 0.505969 0.547756 1.693428
2011 0.351887 0.943145 0.492349 0.528227 1.664240
2010 0.347332 0.921952 0.469949 0.500341 1.613543
United States 2012 2.398873 10.534042 2.045509 2.560677 15.567038
2011 2.434378 10.378060 1.978083 2.493194 15.224555
2010 2.510143 10.185836 1.846280 2.360183 14.992053
wdi.loc["United States"]
GovExpend Consumption Exports Imports GDP
year
2017 2.405743 12.019266 2.287071 3.069954 17.348627
2016 2.407981 11.722133 2.219937 2.936004 16.972348
2015 2.373130 11.409800 2.222228 2.881337 16.710459
2014 2.334071 11.000619 2.209555 2.732228 16.242526
2013 2.353381 10.687214 2.118639 2.600198 15.853796
2012 2.398873 10.534042 2.045509 2.560677 15.567038
2011 2.434378 10.378060 1.978083 2.493194 15.224555
2010 2.510143 10.185836 1.846280 2.360183 14.992053
2009 2.507390 10.010687 1.646432 2.086299 14.617299
2008 2.407771 10.137847 1.797347 2.400349 14.997756
2007 2.351987 10.159387 1.701096 2.455016 15.018268
2006 2.314957 9.938503 1.564920 2.395189 14.741688
2005 2.287022 9.643098 1.431205 2.246246 14.332500
2004 2.267999 9.311431 1.335978 2.108585 13.846058
2003 2.233519 8.974708 1.218199 1.892825 13.339312
2002 2.193188 8.698306 1.192180 1.804105 12.968263
2001 2.112038 8.480461 1.213253 1.740797 12.746262
2000 2.040500 8.272097 1.287739 1.790995 12.620268
wdi.loc[("United States", 2010), ["GDP", "Exports"]]
GDP        14.992053
Exports     1.846280
Name: (United States, 2010), dtype: float64
wdi.loc[("United States", 2010)]
GovExpend       2.510143
Consumption    10.185836
Exports         1.846280
Imports         2.360183
GDP            14.992053
Name: (United States, 2010), dtype: float64
wdi.loc[[("United States", 2010), ("Canada", 2015)]]
GovExpend Consumption Exports Imports GDP
country year
United States 2010 2.510143 10.185836 1.846280 2.360183 14.992053
Canada 2015 0.358303 1.035208 0.568859 0.575793 1.794270
wdi.loc[["United States", "Canada"], "GDP"]
country        year
Canada         2017     1.868164
               2016     1.814016
               2015     1.794270
               2014     1.782252
               2013     1.732714
               2012     1.693428
               2011     1.664240
               2010     1.613543
               2009     1.565291
               2008     1.612862
               2007     1.596876
               2006     1.564608
               2005     1.524608
               2004     1.477317
               2003     1.433089
               2002     1.407725
               2001     1.366590
               2000     1.342805
United States  2017    17.348627
               2016    16.972348
               2015    16.710459
               2014    16.242526
               2013    15.853796
               2012    15.567038
               2011    15.224555
               2010    14.992053
               2009    14.617299
               2008    14.997756
               2007    15.018268
               2006    14.741688
               2005    14.332500
               2004    13.846058
               2003    13.339312
               2002    12.968263
               2001    12.746262
               2000    12.620268
Name: GDP, dtype: float64
wdi.loc["United States", "GDP"]
year
2017    17.348627
2016    16.972348
2015    16.710459
2014    16.242526
2013    15.853796
2012    15.567038
2011    15.224555
2010    14.992053
2009    14.617299
2008    14.997756
2007    15.018268
2006    14.741688
2005    14.332500
2004    13.846058
2003    13.339312
2002    12.968263
2001    12.746262
2000    12.620268
Name: GDP, dtype: float64

(back to text)

Exercise 3

Try setting my_df to some subset of the rows in wdi (use one of the .loc variations above).

Then see what happens when you do wdi / my_df or my_df ** wdi.

Try changing the subset of rows in my_df and repeat until you understand what is happening.

(back to text)

Exercise 4

Below, we create wdi2, which is the same as wdi except that the levels of the index are swapped.

In the cells after wdi2 is defined, we have commented out a few of the slicing examples from the previous exercise.

For each of these examples, use pd.IndexSlice to extract the same data from wdi2.

(Hint: You will need to swap the order of the row slicing arguments within the pd.IndexSlice.)

wdi2 = df.set_index(["year", "country"])
# wdi.loc["United States"]
# wdi.loc[(["United States", "Canada"], [2010, 2011, 2012]), :]
# wdi.loc[["United States", "Canada"], "GDP"]

(back to text)

Exercise 5

Use pd.IndexSlice to extract all data from wdiT where the year level of the column names (the second level) is one of 2010, 2012, and 2014.
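
A sketch, assuming wdiT has two levels of column labels with year as the second:

import pandas as pd

wdiT.loc[:, pd.IndexSlice[:, [2010, 2012, 2014]]]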

(back to text)

Exercise 6

Look up the documentation for the reset_index method and study it to learn how to do the following:

  • Move just the year level of the index back as a column.

  • Completely throw away all levels of the index.

  • Remove the country level of the index and do not keep it as a column.

# remove just year level and add as column
# throw away all levels of index
# Remove country from the index -- don't keep it as a column
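
A sketch of the three operations, assuming wdi has country and year as its index levels:

wdi.reset_index(level="year")                # move year back as a column
wdi.reset_index(drop=True)                   # throw away all levels of the index
wdi.reset_index(level="country", drop=True)  # remove country without keeping it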

(back to text)

Time Series#

Exercises#

Exercise 1

By referring to the table found at the link above, figure out the correct argument to pass as format in order to parse the dates in the three cells below.

Test your work by passing your format string to pd.to_datetime.

christmas_str2 = "2017:12:25"
dbacks_win = "M:11 D:4 Y:2001 9:15 PM"
america_bday = "America was born on July 4, 1776"

(back to text)

Exercise 2

Use pd.to_datetime to express the birthday of one of your friends or family members as a datetime object.

Then use the strftime method to write a message of the format:

NAME's birthday is June 10, 1989 (a Saturday)

(where the name and date are replaced by the appropriate values)
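
A sketch with a hypothetical birthday:

import pandas as pd

bday = pd.to_datetime("1989-06-10")  # a made-up date
print(bday.strftime("NAME's birthday is %B %d, %Y (a %A)"))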

(back to text)

Exercise 3

For each item in the list, extract the specified data from btc_usd:

  • July 2017 through August 2017 (inclusive)

  • April 25, 2015 to June 10, 2016

  • October 31, 2017
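
A sketch using label-based date slicing, assuming btc_usd has a DatetimeIndex:

btc_usd.loc["2017-07":"2017-08"]        # July 2017 through August 2017
btc_usd.loc["2015-04-25":"2016-06-10"]  # April 25, 2015 to June 10, 2016
btc_usd.loc["2017-10-31"]               # October 31, 2017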

(back to text)

Exercise 4

Using the shift function, determine the week with the largest percent change in the volume of trades (the "Volume (BTC)" column).

Repeat the analysis at the bi-weekly and monthly frequencies.

Hint 1: We have data at a daily frequency and one week is 7 days.

Hint 2: Approximate a month by 30 days.

# your code here
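
A sketch of the weekly version; the bi-weekly and monthly versions follow by changing the shift length to 14 and 30:

volume = btc_usd["Volume (BTC)"]
weekly_change = (volume - volume.shift(7)) / volume.shift(7)
print(weekly_change.idxmax())  # end date of the week with the largest percent change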

(back to text)

Exercise 5

Imagine that you have access to the DeLorean time machine from “Back to the Future”.

You are allowed to use the DeLorean only once, subject to the following conditions:

  • You may travel back to any day in the past.

  • On that day, you may purchase one bitcoin at market open.

  • You can then take the time machine 30 days into the future and sell your bitcoin at market close.

  • Then you return to the present, pocketing the profits.

How would you pick the day?

Think carefully about what you would need to compute to make the optimal choice. Try writing it out in the markdown cell below so you have a clear description of the want operator that we will apply after the exercise.

(Note: Don’t look too far below, because in the next non-empty cell we have written out our answer.)

To make this decision, we want to know …

Your answer here

(back to text)

Exercise 6

Do the following:

  1. Write a pandas function that implements your strategy.

  2. Pass it to the agg method of rolling_btc.

  3. Extract the "Open" column from the result.

  4. Find the date associated with the maximum value in that column.

How much money did you make? Compare with your neighbor.

def daily_value(df):
    # DELETE `pass` below and replace it with your code
    pass

rolling_btc = btc_usd.rolling("30d")

# do steps 2-4 here
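
One possible implementation (a sketch, assuming the cell above has run). The assumption here is that agg hands the function each column's 30-day window as a Series, so last-minus-first on the "Open" column is a proxy for the profit from buying at one open and selling 30 days later:

def daily_value(s):
    # change over the rolling window: last observation minus the first
    return s.iloc[-1] - s.iloc[0]

profits = rolling_btc.agg(daily_value)["Open"]  # steps 2 and 3
print(profits.idxmax(), profits.max())          # step 4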

(back to text)

Exercise 7

Now suppose you still have access to the DeLorean, but the conditions are slightly different.

You may now:

  • Travel back to the first day of any month in the past.

  • On that day, you may purchase one bitcoin at market open.

  • You can then travel to any day in that month and sell the bitcoin at market close.

  • Then return to the present, pocketing the profits.

To which month would you travel? On which day of that month would you return to sell the bitcoin?

Discuss with your neighbor what you would need to compute to make the optimal choice. Try writing it out in the markdown cell below so you have a clear description of the want operator that we will apply after the exercise.

(Note: Don’t look too many cells below, because we have written out our answer.)

To make the optimal decision we need …

Your answer here

(back to text)

Exercise 8

Do the following:

  1. Write a pandas function that implements your strategy.

  2. Pass it to the agg method of resampled_btc.

  3. Extract the "Open" column from the result.

  4. Find the date associated with the maximum value in that column.

How much money did you make? Compare with your neighbor.

Was this strategy more profitable than the previous one? By how much?

def monthly_value(df):
    # DELETE `pass` below and replace it with your code
    pass

resampled_btc = btc_usd.resample("MS")

# Do steps 2-4 here
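
A similar sketch for the monthly strategy, under the same assumption that the function receives each column's within-month values as a Series; max-minus-first on the "Open" column proxies buying at the month's first open and selling at its best price:

def monthly_value(s):
    # best within-month value minus the first observation of the month
    return s.max() - s.iloc[0]

profits = resampled_btc.agg(monthly_value)["Open"]  # steps 2 and 3
print(profits.idxmax(), profits.max())              # step 4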