## 3 things we should see in tomorrow’s macroeconometric modelling

Macroeconometric modelling is a funny sort of field. The stakes of doing it wrong (or right) are extremely high, while the data are measured infrequently and often poorly. In many cases, estimating a model on a large-enough sample to do useful inference involves including observations from a long time in the past—I’m talking the 60s and 70s—and believing that the data were both correctly measured and drawn from the *same economy*.

A skeptical macroeconometrician may ask: “how much should data points from the 1960s, 70s, or 80s inform my view of how the world works today?”, and they’d have a good point.

Here’s a field where it’s basically impossible to know *anything*—at least to any scientific standard—that has enormous impact and policy relevance. It’s really no wonder that it attracts a ‘spectrum of personalities’, vying with one another for the ears of our political leaders.

At the same time, macroeconometrics done right is useful. There is not nothing to be learned from history, so long as macroeconometricians are honest about what can and cannot be learned from historical data.

To these ends, I thought I’d put together a list of 3 characteristics that we should expect in tomorrow’s empirical macro models, with a few notes on how to implement them. All of these exist already, but are not standard features of commonly used empirical macro models.

**1. Model uncertainty and sensible confidence intervals**

Most readers here would expect any forecast to come with forecast confidence intervals, normally 95%. The implication to the reader is that the forecaster is “95% sure” that future values will fall inside the confidence band. An alternative interpretation may be that “95% of possible futures” fall inside the confidence band.

Almost all of the time, these confidence bands are poorly constructed, resulting in the reader being *too sure* about the future. This is because confidence intervals constructed the usual way—using historical forecast errors—assume that the underlying economic model is true. That is, using the normal approach, a 95% confidence band contains 95% of potential futures **given that the underlying economic model is a perfect representation of the world**.

Of course, economic models are not perfect representations of the world, and so the 95% confidence band here is useless. I highly doubt that, had the Australian Treasury used its current technique of constructing confidence intervals over the last decade, the resulting bands would have contained 95% of the realised outcomes.

Introducing model uncertainty—uncertainty over how well the model actually represents the world—helps to overcome this. There are ways of introducing ‘model uncertainty’ to a macro model: sometimes by bootstrapping (which I have issues with, as I don’t believe that historical data come from the same model), and more commonly by using Bayesian techniques, with priors that reflect how little we actually know. These tools are used quite frequently by many macro modellers, though, unfortunately, not by many who matter.
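To make the distinction concrete, here’s a minimal sketch in Python with simulated data (all numbers are made up): it compares a ‘plug-in’ 95% band, which treats the fitted model as the truth, with a Bayesian predictive band under a vague prior. The Bayesian band only captures *parameter* uncertainty, not full model uncertainty, but the direction of the effect is the same—the more honest band is wider.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical quarterly data: 40 observations of y on one regressor.
n = 40
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(scale=0.8, size=n)

X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
k = X.shape[1]
s2 = resid @ resid / (n - k)            # residual variance

x_new = np.array([1.0, 1.5])            # hypothetical forecast point

# "Plug-in" 95% half-width: treats the fitted model as the truth.
plug_in = 1.96 * np.sqrt(s2)

# Bayesian predictive half-width under a vague prior: Student-t with
# n - k degrees of freedom, inflated by uncertainty in the coefficients.
leverage = x_new @ np.linalg.inv(X.T @ X) @ x_new
t_crit = stats.t.ppf(0.975, df=n - k)
bayes = t_crit * np.sqrt(s2 * (1.0 + leverage))

print(f"plug-in half-width:    {plug_in:.3f}")
print(f"predictive half-width: {bayes:.3f}")  # always wider
```

The gap between the two grows as the forecast point moves away from the data the model was estimated on, which is exactly when over-confidence bites hardest.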

**2. Coherent weighting schemes/model shifts**

When building an empirical macro model, often one of the most difficult choices is how much data to include. Macroeconomic data are recorded fairly infrequently—monthly for unemployment and trade, quarterly for prices and the national accounts, annually for state accounts, etc. Many of these series don’t really move about too much, which makes it difficult to pin down the relationships between macro variables. This means that the empirical macroeconomist often needs to estimate their models on long histories.

This is a tough choice: include a long history and you end up estimating a fairly useless quantity (the average relationship between variables over the whole period, rather than their relationship today); include a short history and you throw out a lot of data that may have value. One common work-around is to use a weighting scheme that gives more importance to recent observations and less to historical ones. But is the recent past really a better predictor than the distant past? Can we learn nothing from history?
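For concreteness, the standard recency-weighting work-around looks something like this sketch (the half-life is an arbitrary tuning choice I’ve assumed, not anything principled):

```python
import numpy as np

# Hypothetical: 100 quarterly observations, oldest first, newest last.
n = 100
half_life = 20                      # assumed: weight halves every 20 quarters
decay = 0.5 ** (1.0 / half_life)

ages = np.arange(n)[::-1]           # age of each observation, in quarters
weights = decay ** ages             # weight 1 on the newest observation

print(weights[-1])                  # newest quarter: 1.0
print(weights[-21])                 # 20 quarters back: ~0.5
```

Note that the weights depend only on *when* an observation occurred, not on whether it came from an economy that resembles today’s—which is the objection the rest of this section develops.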

Once you’re in the world of time-series modelling, you’re implicitly saying that relationships between historical variables are of some use. **If this is the case, then why not go the whole way and say that more can be learned from more relevant histories?**

One fairly simple way of doing this is to give more weight to relevant histories when we build our models. But how do you know which histories are relevant and which are not? My method is the following:

1. Train a random forest on the relevant dependent variable, using a wide range of independent variables. The random forest is a tool from machine learning that will throw out irrelevant independent variables, so you can afford to put many in.

2. Save the proximity matrix from the random forest. This symmetric matrix gives us a measure of similarity between each pair of observations. Importantly, it captures how similar two observations are *in all the ways that matter for predicting the dependent variable*. I have written on this elsewhere; I consider it one of the most important tools for the future of inferential economic research.

Here are the first five rows and columns of a proximity matrix from the demonstration below.

|        | 1979Q3   | 1979Q4   | 1980Q1   | 1980Q2   | 1980Q3   |
|--------|----------|----------|----------|----------|----------|
| 1979Q3 | 1        | 0.240876 | 0.164179 | 0.19708  | 0.402878 |
| 1979Q4 | 0.240876 | 1        | 0.169355 | 0.212598 | 0.222222 |
| 1980Q1 | 0.164179 | 0.169355 | 1        | 0.132231 | 0.103704 |
| 1980Q2 | 0.19708  | 0.212598 | 0.132231 | 1        | 0.115108 |
| 1980Q3 | 0.402878 | 0.222222 | 0.103704 | 0.115108 | 1        |

3. Run your regression model, taking the appropriate row of the proximity matrix to be the weighting vector. This will normally be the last row, as you’re interested in finding similar histories to *today*.

It’s really that simple.
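The three steps above can be sketched in Python with scikit-learn on simulated data (my original demo is in R; all variable names and data here are hypothetical). scikit-learn doesn’t expose Breiman’s proximity matrix directly, so it’s built by counting how often two observations share a terminal leaf, and the weighted regression is then done by hand:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Hypothetical quarterly data: a dependent variable and 8 candidate predictors.
n, p = 120, 8
X = rng.normal(size=(n, p))
y = 0.6 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Step 1: train a random forest on a wide set of candidate predictors.
forest = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)

# Step 2: build the proximity matrix. prox[i, j] is the fraction of trees
# in which observations i and j land in the same terminal leaf.
leaves = forest.apply(X)                      # shape (n, n_trees)
prox = np.zeros((n, n))
for t in range(leaves.shape[1]):
    same = leaves[:, t][:, None] == leaves[:, t][None, :]
    prox += same
prox /= leaves.shape[1]

# Step 3: weighted least squares, using the last row of the proximity
# matrix (similarity to "today") as the weight vector.
w = prox[-1]
Z = np.column_stack([np.ones(n), X[:, 0]])    # regress y on one variable
beta = np.linalg.solve(Z.T @ (Z * w[:, None]), Z.T @ (w * y))
print("relevance-weighted slope:", beta[1])
```

The diagonal of the proximity matrix is always 1 (every observation shares a leaf with itself in every tree), which is why the most recent observation always gets full weight in its own regression.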

So how much of the data actually gets used in this method? To illustrate, I’ve put together a little demo (code and data, which download automatically on running the script, available here). In it, I’m trying to model labour productivity growth, in particular how much it appears to be affected by changes in unemployment. Note that this is for illustrative purposes, and I’m not making any claims about whether the parameter is well identified.

The figure below illustrates the weights that are being given to historical data observations, along with the fitted values and predicted values.

Using this method, we can see how the relationship between changes in unemployment and changes in productivity varies over time when we give more weight to relevant histories. The line in the middle is what we would estimate today if we gave equal weight to all observations. As we can see, in some histories changes to productivity do appear to move together with changes in unemployment.

These charts wrap up my spiel on using relevant histories, though I’ll probably write some more on it in the future.

**3. The ability to inform the user when the model should not be used**

One of the major shortcomings of macro models today is that they lack an intuitive way of signalling when a forecast or policy simulation should not be performed *because the model was estimated on data from a different world*. Instead, a model is typically just a bunch of coefficients (or occasionally distributions) that we multiply with our hypothetical x variables. It doesn’t care whether those x variables are nothing like the ones the model was estimated on.

This is one of the big areas of abuse of models, sometimes with catastrophic consequences. We estimate a model on the good times, and wonder why it doesn’t work during the bad. Wouldn’t it be wonderful if the model just reported enormous confidence bands whenever it was being asked to do something unreasonable?

Well, this can be done too, using the weighting scheme discussed above. If we’re in the middle of an unusual economy, there will be very few histories proximate to the present, and confidence intervals can be widened accordingly (as the model is effectively being estimated on fewer data points). On the other hand, if we’re estimating the behaviour of a fairly regular economy, we have lots of relevant histories and our confidence intervals will be narrower.
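One simple way to operationalise this—my suggestion, not a standard feature of any macro model—is Kish’s effective sample size. Flat weights give an effective n close to the actual n; weights concentrated on a handful of proximate histories collapse it, and standard errors scale roughly with the square root of the ratio:

```python
import numpy as np

def effective_n(weights):
    """Kish's effective sample size: near n for flat weights, near 1
    when almost all the weight sits on a few observations."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

# A "regular" economy: many proximate histories, flat weights.
flat = np.ones(100)
# An unusual economy: almost all weight on 5 proximate histories.
spiky = np.r_[np.full(95, 0.01), np.full(5, 1.0)]

print(effective_n(flat))    # 100.0
print(effective_n(spiky))   # ~7.1

# Approximate inflation factor for the standard errors.
print(np.sqrt(effective_n(flat) / effective_n(spiky)))
```

So a model asked to forecast through an episode unlike anything in its training data would automatically report bands several times wider, rather than pretending it had a century of relevant experience.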

---

There are many things we should be asking of our macroeconomic modellers. My top three are that they appropriately model what they do not know, that they stop building models on useless data, and that they do not use their models out of context. That’s not too much to ask.

## Matt C said,

July 30, 2014 @ 4:37 am

Hi Jim,

Great post. This (and other stuff you’ve written) has convinced me that I need to learn a lot more about quantitative techniques that have emerged from machine learning.

## khakieconomist said,

July 30, 2014 @ 2:55 pm

Thanks Matt – have you tried any of the coursera courses?

I found working (slowly) through this paper has been extremely helpful.

http://cran.r-project.org/web/packages/rpart/vignettes/longintro.pdf

After that one, Breiman’s (PBUHN) outline of random forests is great.

http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

## Dave said,

July 30, 2014 @ 7:05 pm

There is a very accessible Coursera course called Practical Machine Learning that I am auditing atm. Provides a good overview of the different techniques and some easy-to-use R packages (mainly caret) to implement them. I’ve found it to be a good intro so far, giving enough info to run some simple models without going into the weeds too much.