The Annals of Statistics

On model selection from a finite family of possibly misspecified time series models

Hsiang-Ling Hsu, Ching-Kang Ing, and Howell Tong



Consider finite parametric time series models. “I have $n$ observations and $k$ models, which model should I choose on the basis of the data alone” is a frequently asked question in many practical situations. This poses the key problem of selecting a model from a collection of candidate models, none of which is necessarily the true data generating process (DGP). Although existing literature on model selection is vast, there is a serious lacuna in that the above problem does not seem to have received much attention. In fact, existing model selection criteria have avoided addressing the above problem directly, either by assuming that the true DGP is included among the candidate models and aiming at choosing this DGP, or by assuming that the true DGP can be asymptotically approximated by an increasing sequence of candidate models and aiming at choosing the candidate having the best predictive capability in some asymptotic sense. In this article, we propose a misspecification-resistant information criterion (MRIC) to address the key problem directly. We first prove the asymptotic efficiency of MRIC whether the true DGP is among the candidates or not, within the fixed-dimensional framework. We then extend this result to the high-dimensional case in which the number of candidate variables is much larger than the sample size. In particular, we show that MRIC can be used in conjunction with a high-dimensional model selection method to select the (asymptotically) best predictive model across several high-dimensional misspecified time series models.
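The exact form of the MRIC penalty is not reproduced on this page, but the problem it targets — choosing among $k$ fixed, possibly misspecified candidates by predictive performance — can be illustrated with a generic out-of-sample comparison. The sketch below is hypothetical code, not the authors' procedure: it fits three autoregressive candidates to a series generated by an AR(2) model (so the AR(1) candidate is misspecified) and selects the one with the smallest estimated multistep squared prediction error on a held-out tail.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of an AR(p) model; returns the coefficient vector."""
    n = len(x)
    X = np.column_stack([x[p - j - 1:n - j - 1] for j in range(p)])  # lag j+1
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def h_step_sq_error(x, p, h=2, n_test=50):
    """Average h-step-ahead squared prediction error on a held-out tail,
    refitting on each expanding window and iterating the fitted recursion."""
    errs = []
    for t in range(len(x) - n_test, len(x) - h + 1):
        coef = fit_ar(x[:t], p)
        hist = list(x[t - p:t])
        for _ in range(h):                       # plug-in multistep forecast
            hist.append(coef @ hist[-1:-p - 1:-1])
        errs.append((hist[-1] - x[t + h - 1]) ** 2)
    return float(np.mean(errs))

rng = np.random.default_rng(0)
# True DGP: AR(2); candidates AR(1), AR(2), AR(3) -- AR(1) is misspecified.
n = 600
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()

candidates = [1, 2, 3]
scores = {p: h_step_sq_error(x, p) for p in candidates}
best = min(scores, key=scores.get)
print(scores, "-> selected AR order:", best)
```

MRIC replaces this brute-force holdout comparison with a criterion whose minimizer is asymptotically efficient whether or not the true DGP is among the candidates; the sketch only conveys the target being estimated.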

Article information

Ann. Statist., Volume 47, Number 2 (2019), 1061-1087.

Received: November 2017
Revised: March 2018
First available in Project Euclid: 11 January 2019


Primary: 62M10: Time series, auto-correlation, regression, etc.
Secondary: 62F07: Ranking and selection; 62F12: Asymptotic properties of estimators

Keywords: AIC; BIC; misspecification-resistant information criterion; multistep prediction error; high-dimensional misspecified models; orthogonal greedy algorithm
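The keywords name the orthogonal greedy algorithm (OGA) as the high-dimensional companion method: a screening step that produces a nested path of candidate models, across which a criterion such as MRIC can then select. The details of the paper's procedure are not given on this page; the following is a minimal generic OGA sketch in a hypothetical i.i.d. Gaussian design (not the paper's time series setting).

```python
import numpy as np

def oga(X, y, m):
    """Orthogonal greedy algorithm: at each step pick the predictor most
    correlated with the current residual, then refit by least squares on
    all predictors selected so far. Returns the selection path (indices)."""
    path, resid = [], y.copy()
    for _ in range(m):
        scores = np.abs(X.T @ resid) / np.linalg.norm(X, axis=0)
        scores[path] = -np.inf                 # exclude already-selected columns
        path.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(X[:, path], y, rcond=None)
        resid = y - X[:, path] @ coef          # orthogonalize against selection
    return path

rng = np.random.default_rng(1)
n, p = 200, 1000                               # high-dimensional: p >> n
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[3, 17, 42]] = [2.0, -1.5, 1.0]           # three relevant predictors
y = X @ beta + 0.5 * rng.standard_normal(n)
path = oga(X, y, m=10)
print(path)  # indices 3, 17 and 42 are expected to enter early
```

In the paper's high-dimensional result, a screening path of this kind is combined with MRIC to pick the (asymptotically) best predictive model among the misspecified candidates along the path.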


Hsu, Hsiang-Ling; Ing, Ching-Kang; Tong, Howell. On model selection from a finite family of possibly misspecified time series models. Ann. Statist. 47 (2019), no. 2, 1061--1087. doi:10.1214/18-AOS1706.




Supplemental materials

  • Supplement to “On model selection from a finite family of possibly misspecified time series models”. The supplementary material contains the proofs of all theorems, an extension of the MRIC to a class of nonlinear models, and simulation studies and a real-data analysis illustrating the performance of the proposed methods in both the low- and high-dimensional cases.