A Note on the Validity of Cross-Validation for Evaluating Autoregressive Time Series Prediction

Um bom artigo sobre a aplicação de Cross Validation em séries temporais.

Abstract: One of the most widely used standard procedures for model evaluation in classification and regression is K-fold cross-validation (CV). However, when it comes to time series forecasting, because of the inherent serial correlation and potential non-stationarity of the data, its application is not straightforward and often omitted by practitioners in favour of an out-of-sample (OOS) evaluation. In this paper, we show that in the case of a purely autoregressive model, the use of standard K-fold CV is possible as long as the models considered have uncorrelated errors. Such a setup occurs, for example, when the models nest a more appropriate model. This is very common when Machine Learning methods are used for prediction, where CV in particular is suitable to control for overfitting the data. We present theoretical insights supporting our arguments. Furthermore, we present a simulation study and a real-world example where we show empirically that K-fold CV performs favourably compared to both OOS evaluation and other time-series-specific techniques such as non-dependent cross-validation.

Conclusions: In this work we have investigated the use of cross-validation procedures for time series prediction evaluation when purely autoregressive models are used, which is a very common situation; e.g., when using Machine Learning procedures for time series forecasting. In a theoretical proof, we have shown that a normal K-fold cross-validation procedure can be used if the residuals of our model are uncorrelated, which is especially the case if the model nests an appropriate model. In the Monte Carlo experiments, we have shown empirically that even if the lag structure is not correct, as long as the data are fitted well by the model, cross-validation without any modification is a better choice than OOS evaluation. We have then in a real-world data example shown how these findings can be used in a practical situation. Cross-validation can adequately control overfitting in this application, and only if the models underfit the data and lead to heavily correlated errors, are the cross-validation procedures to be avoided as in such a case they may yield a systematic underestimation of the error. However, this case can be easily detected by checking the residuals for serial correlation, e.g., using the Ljung-Box test.

cv-wp

Anúncios
A Note on the Validity of Cross-Validation for Evaluating Autoregressive Time Series Prediction

Cross-Validation, e a estimativa da estimativa

No ótimo post do blog do Andrew Gelman em que o título é “Cross-Validation != Magic” tem uma observação muito importante sobre a definição do Cross-Validation (Validação Cruzada) que é:

“2. Cross-validation is a funny thing. When people tune their models using cross-validation they sometimes think that because it’s an optimum that it’s the best. Two things I like to say, in an attempt to shake people out of this attitude:

(a) The cross-validation estimate is itself a statistic, i.e. it is a function of data, it has a standard error etc.

(b) We have a sample and we’re interested in a population. Cross-validation tells us what performs best on the sample, or maybe on the hold-out sample, but our goal is to use what works best on the population. A cross-validation estimate might have good statistical properties for the goal of prediction for the population, or maybe it won’t.

Just cos it’s “cross-validation,” that doesn’t necessarily make it a good estimate. An estimate is an estimate, and it can and should be evaluated based on its statistical properties. We can accept cross-validation as a useful heuristic for estimation (just as Bayes is another useful heuristic) without buying into it as necessarily best.

Cross-Validation é um assunto bem espinhoso quando se trata de amostragem e/ou estimação de modelos devido ao fato de que há diversas opiniões a favor e contra.

Conhecer as propriedades de cada método de amostragem e saber as suas propriedades matemáticas/estatísticas, vantagens e desvantagens e principalmente limitações é regra número 1 para qualquer data miner.

Particularmente, eu vejo o Cross-Validation como um método excelente quando se tem um universo de dados restrito (poucas instâncias treinamento), ou mesmo quando faço as validações com método normal de 80-10-10 de amostragem; mas isso é mais uma heurística de trabalho do que uma regra propriamente dita .

Cross-Validation, e a estimativa da estimativa

CART e Cross-Validation

Neste vídeo da Salford Systems há um pequeno trecho de uma palestra feita em 2004.

A validação cruzada é um tema que gera algumas controvérsias, mas querendo ou não para quem realiza experimentos com bases de dados com menos de 50.000 registros (é um número cabalístico, mas ainda serve) pode ser a saída para bons resultados.

Uma ótima (na verdade é a melhor) referência sobre esse assunto e as vantagens e desvantagens está no livro de HASTIE, Trevor et al. The elements of statistical learning que pode ser baixado aqui.

CART e Cross-Validation