Abstract: We derive generalization error bounds for traditional time- series forecasting models. Our results hold for many standard forecasting tools including autoregressive models, moving average models, and, more generally, linear state-space models. These non-asymptotic bounds need only weak assumptions on the data-generating process, yet allow forecasters to select among competing models and to guarantee, with high probability, that their chosen model will perform well. We motivate our techniques with and apply them to standard economic and financial forecasting tools—a GARCH model for predicting equity volatility and a dynamic stochastic general equilibrium model (DSGE), the standard tool in macroeconomic forecasting. We demonstrate in particular how our techniques can aid forecasters and policy makers in choosing models which behave well under uncertainty and mis-specification.
Conclusion: This paper demonstrates how to control the generalization error of common time-series forecasting models, especially those used in economics and engineering—ARMA models, vector autoregressions (Bayesian or otherwise), linearized dynamic stochastic general equilibrium models, and linear state-space models. We derive upper bounds on the risk, which hold with high probability while requiring only weak assumptions on the data-generating process. These bounds are finite sample in nature, unlike standard model selection penalties such as AIC or BIC. Furthermore, they do not suffer the biases inherent in other risk estimation techniques such as the pseudo-cross validation approach often used in the economic forecasting literature.
While we have stated these results in terms of standard economic forecasting models, they have very wide applicability. Theorem 12 applies to any forecasting procedure with fixed memory length, linear or non-linear. Theorem 17 applies only to methods whose forecasts are linear in the observations, but a similar result for nonlinear methods would just need to ensure that the dependence of the forecast on the past decays in some suitable way. Rather than deriving bounds theoretically, one could attempt to estimate bounds on the risk. While cross-validation is tricky (Racine, 2000), nonparametric bootstrap procedures may do better. A fully nonparametric version is possible, using the circular bootstrap (reviewed in Lahiri, 1999). Bootstrapping lengthy out-of-sample sequences for testing fitted model predictions yields intuitively sensible estimates of Rn(f), but there is currently no theory about the coverage level. Also, while models like VARs can be fit quickly to simulated data, general state-space models, let alone DSGEs, require large amounts of computational power, which is an obstacle to any resampling method.