14. Strategy Evaluation
How do we evaluate whether a strategy is any good?
The first answer might be to look at the strategy’s Sharpe ratio, which measures how much premium you earn per unit of volatility, aka risk.
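As a minimal sketch of that calculation, the snippet below computes an annualized Sharpe ratio from per-period excess returns. The function name `sharpe_ratio`, the `periods_per_year` parameter, and the simulated monthly returns are assumptions made for illustration; they are not taken from the course code.

```python
import numpy as np

def sharpe_ratio(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio from per-period excess returns.

    Assumes the inputs are already returns in excess of the risk-free rate
    and scales the per-period ratio by sqrt(periods_per_year).
    """
    r = np.asarray(excess_returns, dtype=float)
    mean = r.mean()          # average premium per period
    vol = r.std(ddof=1)      # sample volatility per period
    return np.sqrt(periods_per_year) * mean / vol

# Hypothetical data: 20 years of simulated monthly excess returns.
rng = np.random.default_rng(0)
monthly_excess = rng.normal(loc=0.005, scale=0.04, size=240)
print("annualized Sharpe ratio:", round(sharpe_ratio(monthly_excess), 2))
```

The square-root scaling is the usual convention when returns are assumed to be uncorrelated over time; it is a simplification, not a property of every return series.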
Statistics: Just because a strategy had a high return and/or low volatility in a particular sample does not mean that the sample is a good forecast of the asset’s behavior going forward. As we saw in our “Estimation Uncertainty” chapter, these estimates can move quite a lot and be quite uninformative about the future. We will use statistics to document which features of the data are reliable, i.e. features that are unlikely to be due to pure chance.
Comovement: While looking at Sharpe ratios is a sensible way to evaluate your overall portfolio, it is not a sensible way to evaluate an individual strategy or asset. Because of correlations, two strategies with exactly the same Sharpe ratio can be of different value to you, depending on how each one comoves with your overall portfolio.
Statistical Biases: We have to be very careful about how we interpret statistical tests, because the data we look at is always heavily selected in some way. As a result, the theoretical properties of our statistical tests do not really hold in practice. So we need to be aware of a whole host of biases when looking at the data. A few examples include:
Survivorship bias: An asset can have a very high average return in the sample by the mere fact that it didn’t crash. For example, most of the equity markets in the world collapsed multiple times, so the US market looks great because it survived. But this could be due only to chance, because an investor in 1900 wouldn’t have known that the US equity market would be the one to survive. So when we condition on the US market, we are implicitly using future information, since our sample does not invest in German, French, or Japanese stocks before WWII. This can affect crypto and other new asset classes particularly severely.
Publication/Famous bias: Researchers and practitioners are always looking for strategies that work in past data. Some of these strategies will work just by chance, because returns are so noisy and the data is not long enough. The stuff that gets published is the stuff that has worked in the data, because journals will not publish “null results”, i.e. no one will publish a strategy that does NOT work. A more general bias of this same form is the famous bias. For example, we only talk about Warren Buffett or Bitcoin because they did well in the past. But for every Warren Buffett there is also an unsuccessful manager who lost everything, and for every Bitcoin there is another cryptocurrency that collapsed. Unless you know how to pick Warren Buffett or Bitcoin before they became famous, evaluating the strategy of investing with Warren Buffett or buying Bitcoin using the whole data is not really valid.
There are many more. The fundamental issue is that our statistics do not account for the “discovery process” of an asset/strategy/manager.
Tail risks: Another important issue relates to the higher moments (tail risks) of the strategy. An investor with mean-variance preferences would not care about those. But who has mean-variance preferences? MV preferences are a good approximation for deciding how to invest in broadly diversified portfolios at reasonable horizons (months), when returns are pretty well behaved. But it is fair to say that between two strategies with the same Sharpe ratio, most investors will prefer the one that has less extreme negative returns.
Background risks: An important aspect for many investors is to understand the macroeconomic conditions in the periods when the strategy performs poorly. We have to be humble here, because the data will typically not allow us to say anything overly precise, but investors will typically care about the specific scenarios where the strategy does poorly: Does it pay poorly in recessions? Financial crises? High-inflation periods? Low-growth periods? Wars?
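To make the publication bias point above concrete, here is a small simulation sketch. It generates many strategies whose true premium is zero and compares the average in-sample Sharpe ratio with that of the best performer. The function name `simulate_null_sharpes` and all parameter values (1000 strategies, 10 years of monthly data, 4% monthly volatility, normally distributed returns) are assumptions chosen for illustration.

```python
import numpy as np

def simulate_null_sharpes(n_strategies=1000, n_months=120,
                          monthly_vol=0.04, seed=0):
    """Annualized in-sample Sharpe ratios of strategies with zero true premium."""
    rng = np.random.default_rng(seed)
    # Each column is one strategy; every strategy has a true mean of zero.
    returns = rng.normal(0.0, monthly_vol, size=(n_months, n_strategies))
    mean = returns.mean(axis=0)
    vol = returns.std(axis=0, ddof=1)
    return np.sqrt(12) * mean / vol

sharpes = simulate_null_sharpes()
print("average Sharpe across all strategies:", round(sharpes.mean(), 2))
print("Sharpe of the best-looking strategy: ", round(sharpes.max(), 2))
```

Under these assumptions the average across strategies should sit near zero, while the single best strategy looks attractive purely by chance. Reporting only the winner is exactly the kind of “discovery process” that standard test statistics do not account for.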
We will go over these in the next sections. We will address points 1 and 2 (statistics and comovement) together in the econometrics of strategy evaluation. We will then discuss point 3 (statistical biases) in Out of Sample analysis, and points 4 and 5 (tail risks and background risks) in Tail risks and Background risks (TBD).
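As a rough preview of how the statistics and comovement points can be handled together, the sketch below regresses a strategy’s excess returns on the excess returns of the investor’s existing portfolio and reports the intercept (alpha), the slope (beta), and a t-statistic for alpha. This is one common approach, not necessarily the one used in later chapters. The function name `alpha_beta_tstat`, the simulated data, and the use of classical (homoskedastic) standard errors are all assumptions.

```python
import numpy as np

def alpha_beta_tstat(strategy, portfolio):
    """OLS of strategy excess returns on portfolio excess returns.

    Returns (alpha, beta, t-statistic of alpha) using classical
    homoskedastic standard errors.
    """
    y = np.asarray(strategy, dtype=float)
    x = np.asarray(portfolio, dtype=float)
    X = np.column_stack([np.ones_like(x), x])      # intercept and slope
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)          # covariance of estimates
    alpha, beta = coef
    return alpha, beta, alpha / np.sqrt(cov[0, 0])

# Hypothetical data: 50 years of simulated monthly excess returns.
rng = np.random.default_rng(0)
n = 600
portfolio = rng.normal(0.005, 0.04, n)
# Strategy A: close to a copy of the portfolio, plus a little noise.
strategy_a = portfolio + rng.normal(0.0, 0.005, n)
# Strategy B: same true mean and similar volatility, but independent.
strategy_b = rng.normal(0.005, 0.04, n)

for name, s in [("A", strategy_a), ("B", strategy_b)]:
    alpha, beta, t_alpha = alpha_beta_tstat(s, portfolio)
    print(f"{name}: alpha={alpha:.4f}, beta={beta:.2f}, t(alpha)={t_alpha:.2f}")
```

The two simulated strategies have similar true Sharpe ratios, but only the one that is uncorrelated with the existing portfolio has a positive true alpha. The t-statistic indicates how reliably that alpha is distinguished from zero in the sample.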