Managing Whipsaw Risks by Measuring Potential Changes to Tracking Error

When we meet with clients, we often end up explaining both the science and the art of portfolio design.  The science is in our model construction and quantitative measures; the art is in the rules we wrap around those measures to produce portfolio allocations.

The example that comes up most frequently is designing for model failure.  When I mention the concept, a lot of people ask for an example.  First, let me define what I mean by model failure: one or more trade decisions that result in underperformance, on either an absolute or a relative basis.

With that in mind, the first example I normally give is of a portfolio that is either fully invested in the S&P 500 or completely invested in cash.  If we had a model that provided a signal to make this trading decision, we would require near-perfect accuracy from it, because every bad trade (a whipsaw) would have a large impact on portfolio performance.  Therefore, we would never recommend such a portfolio: the design does not tolerate temporary model failure (which is bound to happen from time to time).

Things get more complicated when you add more securities to the portfolio and the rules become more complex.  As portfolio designers, we need a firm grasp of the environments in which we expect our models to do well and those in which they may struggle.  With that knowledge, we can design rules specifically to help manage risk in difficult environments.

Consider our flagship technology: our dynamic, volatility-adjusted momentum model.  Like any momentum model, it is prone to whipsaw and struggles in mean-reverting environments.  In our example above ("S&P 500 or cash"), a prolonged, volatile, sideways market may generate several whipsaw events that permanently impair the portfolio.

Considering this scenario, we may first decide to incorporate more signals such that our portfolio requires a confluence of signals before making a dramatic portfolio decision.  For example, instead of investing in the S&P 500, we may break it down into its primary sectors and run our model on each sector, removing sectors with negative momentum and rebalancing available capital into positive momentum sectors (subject to some allocation cap).  In this manner, a single signal turning off does not lead to a dramatic cash position -- several signals are required to flip off before cash is built.
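To make the sector rule concrete, here is a minimal sketch in Python.  It is not our production logic: the function name, the generic sector labels (S1 to S9), the equal starting weights and the 25% cap are all assumptions chosen for illustration.  Sectors with a non-positive signal are dropped, their capital is spread pro rata across the surviving sectors, and anything the cap prevents from being invested is left in cash.

```python
def allocate(base_weights, signals, cap):
    """Return sector weights plus a 'CASH' entry (illustrative only).

    base_weights: dict of sector -> positive starting weight (sums to 1).
    signals: dict of sector -> momentum signal (positive means "hold").
    cap: maximum weight any single sector may receive.
    """
    # Keep only the sectors whose signal is still positive.
    free = {s: w for s, w in base_weights.items() if signals[s] > 0}
    weights = {s: 0.0 for s in base_weights}
    budget = 1.0  # capital not yet assigned

    while free:
        total = sum(free.values())
        proposed = {s: budget * w / total for s, w in free.items()}
        capped = [s for s, w in proposed.items() if w > cap + 1e-12]
        if not capped:
            weights.update(proposed)
            budget = 0.0
            break
        # Pin over-cap sectors at the cap, then redistribute what is left.
        for s in capped:
            weights[s] = cap
            budget -= cap
            del free[s]

    # Whatever could not be placed under the cap is held as cash.
    weights["CASH"] = max(budget, 0.0)
    return weights


# Hypothetical nine-sector portfolio, equally weighted, 25% cap.
base = {f"S{i}": 1.0 / 9.0 for i in range(1, 10)}

one_off = {s: 1.0 for s in base}
one_off["S1"] = -1.0  # a single signal turns off
print(allocate(base, one_off, cap=0.25)["CASH"])

six_off = {s: -1.0 for s in base}
for s in ("S1", "S2", "S3"):  # only three signals remain on
    six_off[s] = 1.0
print(allocate(base, six_off, cap=0.25)["CASH"])
```

With one signal off, the remaining eight sectors absorb the capital and no cash is built.  With only three signals left on, each sector hits the cap and the remainder sits in cash, which is the "several signals are required" behavior described above.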

What we are really talking about here is trading signals leading to tracking error and how much tracking error we are willing to take on for a given signal.  I don't like the term tracking error because error has a negative connotation; we should remember that for active managers, tracking error is where objectives are met.  It is also where underperformance may be introduced.

Going back to our initial model, a trade from 100% long to 100% cash creates a very large tracking error: so large, in fact, that it would require near-perfect model accuracy to justify.  Now consider if we went from the S&P 500 to a negatively correlated asset (e.g. long-term Treasuries): if that asset is about as volatile as the S&P 500 and the correlation is strongly negative, the tracking error nearly doubles.  Now the accuracy required is astronomical.
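The comparison above can be checked with a small calculation.  This is a sketch under stated assumptions: tracking error is taken to be the volatility of the return difference between the new holding and the old one, cash is treated as having zero volatility, and the 15% volatility and -0.9 correlation are hypothetical numbers rather than estimates.

```python
import math


def switch_tracking_error(vol_from, vol_to, corr):
    """Volatility of (new asset return - old asset return) for a full switch.

    Uses the standard identity
        Var(B - A) = vol_A^2 + vol_B^2 - 2 * corr * vol_A * vol_B.
    """
    var = vol_from ** 2 + vol_to ** 2 - 2.0 * corr * vol_from * vol_to
    return math.sqrt(max(var, 0.0))


equity_vol = 0.15  # hypothetical annualized volatility of the equity index

# Switching to cash: the tracking error is simply the equity volatility.
to_cash = switch_tracking_error(equity_vol, 0.0, 0.0)

# Switching to a hypothetical asset with the same volatility, correlation -0.9.
to_hedge = switch_tracking_error(equity_vol, 0.15, -0.9)

print(to_cash, to_hedge, to_hedge / to_cash)
```

Under these inputs the ratio is the square root of 3.8, or roughly 1.95.  It only reaches exactly 2 when the two volatilities match and the correlation is -1, and it is smaller when the replacement asset is less volatile or less negatively correlated.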

On the other hand, removing Utilities from the sector portfolio and rebalancing into the other sectors would not introduce nearly as much tracking error into the portfolio.  Traditionally, momentum trading tends to have many small losses (papercut whipsaw events) and a few large gains.  This asymmetric payoff profile, in my experience, also manifests itself in performance dispersion: when a trade is a whipsaw, the removed sector typically outperforms its peers in the short term, but the overall dispersion is minimal; when a trade is accurate, the dispersion can be very meaningful.  Therefore, rebalancing into the other sectors, which have highly correlated return streams, helps manage whipsaw risk.

But how do we design the overall portfolio?  Do we use sector weights in line with the S&P 500?  Below I have plotted the relative magnitude of tracking error introduced into such a portfolio, over time, by removing each sector and rebalancing into the others.  What we see is that some sectors are much more important than others in dictating overall performance dispersion from our base weights (and therefore in dictating the value created by our trading signals).  Looking at the graph, we can be correct on Utilities (XLU) 10 out of 10 times in 2000 and 2001, but a single wrong trade in Technology (XLK) would likely wipe out our profit.  So while we use the same model to drive our investment decisions on all the sectors -- and likely have the same long-term accuracy expectations for each -- the design of the portfolio demands higher accuracy on XLK than on XLU.

[Figure: mcte_replication (relative tracking error introduced over time by removing each sector from the S&P 500-weighted sector portfolio and rebalancing into the others)]
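The per-sector measure described above can be sketched as follows.  This is an illustration, not the calculation behind the graph: the function name, the three generic sectors, their weights, volatilities and the single 0.6 correlation are hypothetical, and the code assumes every starting weight is below 100%.  For each sector, we zero its weight, scale the others up pro rata, and compute the volatility of the resulting active weights against the covariance matrix.

```python
import math


def removal_tracking_errors(weights, cov):
    """Tracking error created by removing each holding in turn.

    weights: list of base weights summing to 1 (each strictly below 1).
    cov: covariance matrix as a list of lists, same ordering as weights.
    """
    n = len(weights)
    results = []
    for k in range(n):
        rest = 1.0 - weights[k]
        # New weights: holding k goes to zero, the others scale up pro rata.
        new = [0.0 if i == k else weights[i] / rest for i in range(n)]
        active = [new[i] - weights[i] for i in range(n)]
        # Quadratic form active' * cov * active gives the active variance.
        var = sum(active[i] * cov[i][j] * active[j]
                  for i in range(n) for j in range(n))
        results.append(math.sqrt(max(var, 0.0)))
    return results


# Hypothetical inputs: three sectors with one heavy, volatile holding.
names = ["Heavy", "Middle", "Light"]
weights = [0.50, 0.40, 0.10]
vols = [0.30, 0.18, 0.15]
corr = 0.6  # single common correlation, an assumption for simplicity
cov = [[vols[i] * vols[j] * (1.0 if i == j else corr) for j in range(3)]
       for i in range(3)]

tes = removal_tracking_errors(weights, cov)
total = sum(tes)
for name, te in zip(names, tes):
    print(name, round(te, 4), round(te / total, 3))
```

The last column is each sector's share of the total, which is one way to express "relative magnitude."  A sector that is both heavily weighted and volatile dominates the list, which is the intuition behind the difference between XLK and XLU in 2000 and 2001.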

Now consider a portfolio that equal-weights the sectors throughout time:

[Figure: mcte_ew]

What we see is a more even spread of relative importance across the sectors.  The sectors are still not equal -- but the accuracy burden is certainly more evenly spread.  In fact, we can translate these graphs into effective concentration numbers that tell us how concentrated our weightings are.  While we are invested in 9 holdings, our weights may be such that we are effectively concentrated in only 2 or 3 (note: this calculation does not incorporate correlation or covariance; it is simply 1 divided by the sum of squared weights).

[Figure: effective_number_of_bets]

What we see is that the equal-weight portfolio is less concentrated than the S&P 500 weighted portfolio nearly the entire time.
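The concentration measure mentioned above (1 divided by the sum of squared weights) is simple to compute.  The sketch below uses a hypothetical function name and made-up weights; it normalizes the weights first so that they sum to 1, which is an added convenience rather than part of the definition.

```python
def effective_number_of_holdings(weights):
    """Return 1 / sum(w_i^2) after normalizing the weights to sum to 1."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


# Nine equally weighted holdings count as nine effective holdings.
equal = [1.0 / 9.0] * 9
print(effective_number_of_holdings(equal))

# Hypothetical concentrated weights across nine holdings.
concentrated = [0.40, 0.25, 0.10, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03]
print(effective_number_of_holdings(concentrated))
```

The equal-weight case returns 9 (up to floating-point rounding), while the concentrated example comes out well below 9, even though both portfolios hold nine positions.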

Understanding the tracking error implications of rule construction, and therefore the accuracy burden placed on our models, is critical to effective portfolio design.  It is difficult (if not impossible) to quantify every possible trading decision in all possible portfolio configurations to ensure that an implied tracking error threshold isn't broken.  Instead, we have to intuitively understand the relationships between asset classes and how the portfolio can find itself making large tracking-error trading decisions.  We then need to be comfortable that, in those scenarios, the large tracking-error swings are warranted and the accuracy of our models is high enough.  That is the art of portfolio design.

Follow-up thoughts for later research: can we construct a portfolio weighting such that all potential trading decisions have equal tracking error implications?  E.g. can we take our equal sector weight example and find weights such that the tracking error concentration is equally spread?  Is it worth the hassle given the estimation risk we take on when measuring asset covariances?

NOTE:  All analysis in this post is hypothetical and backtested.
