Statistical Science

Feature Matching in Time Series Modeling

Yingcun Xia and Howell Tong

Abstract

Using a time series model to mimic an observed time series has a long history. However, with regard to this objective, conventional estimation methods for discrete-time dynamical models are frequently found to be wanting. In fact, they are characteristically misguided in at least two respects: (i) assuming that there is a true model; (ii) evaluating the efficacy of the estimation as if the postulated model were true. There are numerous examples of models that, when fitted by conventional methods, fail to capture some of the most basic global features of the data, such as cycles with well-matched periods and singularities of spectral density functions (especially at the origin). We argue that the shortcomings need not always be due to the model formulation but may instead be due to the inadequacy of the conventional fitting methods. After all, all models are wrong, but some are useful if they are fitted properly. The practical issue becomes one of how best to fit the model to data.

Thus, in the absence of a true model, we prefer an alternative to conventional model fitting, which typically relies on one-step-ahead prediction errors. Our primary aim is to match the joint probability distribution of the observable time series, including long-term features of the dynamics that underpin the data, such as cycles and long memory, rather than to achieve short-term prediction. For want of a better name, we call this specific aim feature matching.
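
The contrast between one-step-ahead fitting and fitting over longer horizons can be illustrated with a toy calculation. The sketch below is not the authors' estimator; it is a minimal example under assumptions that do not appear in the abstract. It assumes a latent AR(1) process with coefficient phi observed with independent measurement error, so that the observed series (scaled to unit variance) has autocorrelation c * phi**k at lag k >= 1, where c < 1 is the share of variance due to the latent process. An AR(1) predictor theta**k * y_t is then fitted by minimizing the population sum of squared k-step-ahead prediction errors for k = 1, ..., m. The names `multi_step_loss`, `fit_theta`, `phi`, `c` and `m` are hypothetical.

```python
def multi_step_loss(theta, phi, c, m):
    """Population sum of squared k-step prediction errors, k = 1..m.

    Assumes a unit-variance observed series with autocorrelation
    c * phi**k at lag k (latent AR(1) plus measurement error), predicted
    by theta**k * y_t. For each k the expected squared error is
    1 + theta**(2k) - 2 * theta**k * c * phi**k.
    """
    return sum(
        1.0 + theta ** (2 * k) - 2.0 * theta ** k * c * phi ** k
        for k in range(1, m + 1)
    )


def fit_theta(phi, c, m, grid_size=1000):
    """Grid search for the theta in [0, 1) that minimizes the loss."""
    grid = [i / grid_size for i in range(grid_size)]
    return min(grid, key=lambda th: multi_step_loss(th, phi, c, m))


phi, c = 0.9, 0.8  # assumed latent coefficient and variance share

theta_one_step = fit_theta(phi, c, m=1)   # conventional one-step fit
theta_multi = fit_theta(phi, c, m=10)     # fit over horizons 1..10

print("one-step fit:  ", theta_one_step)
print("multi-step fit:", theta_multi)
```

With m = 1 the minimizer is c * phi = 0.72, so the one-step fit understates the persistence of the latent dynamics. Each longer horizon k pulls the minimizer towards c**(1/k) * phi, which approaches phi as k grows, so the multi-step fit lies strictly closer to phi in this toy setting.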

The challenges of model misspecification, measurement errors and the scarcity of data are forever present in real time series modeling. In this paper, by synthesizing earlier attempts into an extended likelihood, we develop a systematic approach to empirical time series analysis that addresses these challenges and aims at better feature matching. Rigorous proofs are included but relegated to the Appendix. Numerical results, based on both simulations and real data, suggest that the proposed catch-all approach has several advantages over the conventional methods, especially when the time series is short or has strong cyclical fluctuations. We conclude by listing directions that require further development.

Article information

Source
Statist. Sci. Volume 26, Number 1 (2011), 21-46.

Dates
First available in Project Euclid: 9 June 2011

Permanent link to this document
https://projecteuclid.org/euclid.ss/1307626560

Digital Object Identifier
doi:10.1214/10-STS345

Mathematical Reviews number (MathSciNet)
MR2849904

Zentralblatt MATH identifier
1219.62142

Keywords
ACF; Bayesian statistics; black-box models; blowflies; Box’s dictum; calibration; catch-all approach; ecological populations; data mining; epidemiology; feature consistency; feature matching; least squares estimation; maximum likelihood; measles; measurement errors; misspecified models; model averaging; multi-step-ahead prediction; nonlinear time series; observation errors; optimal parameter; periodicity; population models; sea levels; short time series; SIR epidemiological model; skeleton; substantive models; sunspots; threshold autoregressive models; Whittle’s likelihood; XT-likelihood

Citation

Xia, Yingcun; Tong, Howell. Feature Matching in Time Series Modeling. Statist. Sci. 26 (2011), no. 1, 21–46. doi:10.1214/10-STS345. https://projecteuclid.org/euclid.ss/1307626560


References

  • Akaike, H. (1978). On the likelihood of a time series model. The Statistician 27 217–235.
  • Alligood, K. T., Sauer, T. D. and Yorke, J. A. (1997). Chaos: An Introduction to Dynamical Systems. Springer, New York.
  • Anderson, R. M. and May, R. M. (1991). Infectious Diseases of Humans: Dynamics and Control. Oxford Univ. Press, Oxford.
  • Bailey, N. T. J. (1957). The Mathematical Theory of Epidemics. Hafner Publishing Co., New York.
  • Bartlett, M. S. (1956). Deterministic and stochastic models for recurrent epidemics. In Proc. Third Berkeley Symp. Math. Statist. Probab. IV 81–109. Univ. California Press, Berkeley.
  • Bartlett, M. S. (1957). Measles periodicity and community size. J. Roy. Statist. Soc. Ser. A 120 48–70.
  • Bartlett, M. S. (1960). The critical community size for measles in the United States. J. Roy. Statist. Soc. Ser. A 123 37–44.
  • Bhansali, R. J. and Kokoszka, P. S. (2002). Computation of the forecast coefficients for multistep prediction of long-range dependent time series. Int. J. Forecasting 18 181–206.
  • Bjørnstad, O. N., Finkenstädt, B. and Grenfell, B. T. (2002). Dynamics of measles epidemics: Estimating scaling of transmission rates using a time series SIR model. Ecological Monographs 72 169–184.
  • Box, G. E. P. (1976). Science and statistics. J. Amer. Statist. Assoc. 71 791–799.
  • Box, G. E. P. and Jenkins, G. M. (1970). Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco, CA.
  • Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods, 2nd ed. Springer, New York.
  • Canova, F. (2007). Methods for Applied Macroeconomic Research. Princeton Univ. Press, Princeton.
  • Chan, K.-S. and Tong, H. (2001). Chaos: A Statistical Perspective. Springer, New York.
  • Chan, K.-S., Tong, H. and Stenseth, N. C. (2009). Analyzing short time series data from periodically fluctuating rodent populations by threshold models: A nearest block bootstrap approach (with discussion). Sci. China Ser. A 52 1085–1112.
  • Chen, R., Yang, L. and Hafner, C. (2004). Nonparametric multistep-ahead prediction in time series analysis. J. R. Stat. Soc. Ser. B Stat. Methodol. 66 669–686.
  • Cheng, B. and Tong, H. (1992). On consistent nonparametric order determination and chaos (with discussion). J. Roy. Statist. Soc. Ser. B 54 427–474.
  • Cox, D. R. (1961). Prediction by exponentially weighted moving averages and related methods. J. Roy. Statist. Soc. Ser. B 23 414–422.
  • Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford Statistical Science Series 24. Oxford Univ. Press, Oxford.
  • Dye, C. and Gay, N. (2003). Modeling the SARS epidemic. Science 300 1884–1885.
  • Earn, D. J. D., Rohani, P., Bolker, B. M. and Grenfell, B. T. (2000). A simple model for complex dynamical transitions in epidemics. Science 287 667–670.
  • Ellner, S. P., Seifu, Y. and Smith, R. H. (2002). Fitting population-dynamic models to time-series data by gradient matching. Ecology 83 2256–2270.
  • Fan, J. and Yao, Q. (2003). Nonlinear Time Series: Nonparametric and Parametric Methods. Springer, New York.
  • Fan, J. and Zhang, W. (2004). Generalised likelihood ratio tests for spectral density. Biometrika 91 195–209.
  • Finkenstädt, B. F. and Grenfell, B. T. (2000). Time series modelling of childhood diseases: A dynamical systems approach. J. Roy. Statist. Soc. Ser. C 49 187–205.
  • Friedlander, B. and Sharman, K. C. (1985). Performance evaluation of the modified Yule-Walker estimator. IEEE Trans. Acoust., Speech, Signal Process. 33 719–725.
  • Georgiou, T. T. (2007). Distances and Riemannian metrics for spectral density functions. IEEE Trans. Signal Process. 55 3995–4003.
  • Glass, K., Xia, Y. and Grenfell, B. T. (2003). Interpreting time-series analyses for continuous-time biological models—Measles as a case study. J. Theoret. Biol. 223 19–25.
  • Grenfell, B. T., Bjørnstad, O. N. and Finkenstädt, B. (2002). Dynamics of measles epidemics: Scaling noise, determinism and predictability with the TSIR model. Ecological Monographs 72 185–202.
  • Guo, M., Bai, Z. and An, H. Z. (1999). Multi-step prediction for nonlinear autoregression models based on empirical distributions. Statist. Sinica 9 559–570.
  • Gurney, W. S. C., Blythe, S. P. and Nisbet, R. M. (1980). Nicholson’s blowflies revisited. Nature 287 17–21.
  • Hall, A. R. (2005). Generalized Method of Moments. Oxford Univ. Press, Oxford.
  • He, D., Ionides, E. L. and King, A. A. (2010). Plug-and-play inference for disease dynamics: Measles in large and small towns as a case study. J. Roy. Soc. Interface 7 271–283.
  • Hethcote, H. W. (1976). Qualitative analyses of communicable disease models. Math. Biosci. 28 335–356.
  • Isham, V. and Medley, G. (2008). Models for Infectious Human Diseases: Their Structure and Relation to Data. Cambridge Univ. Press, Cambridge.
  • Keeling, M. J. and Grenfell, B. T. (1997). Disease extinction and community size: Modeling the persistence of measles. Science 275 65–67.
  • King, A. A., Ionides, E. L., Pascual, M. and Bouma, M. J. (2008). Inapparent infections and cholera dynamics. Nature 454 877–880.
  • Kydland, F. E. and Prescott, E. C. (1996). The computational experiment: An econometric tool. J. Economic Perspectives 10 69–85.
  • Laneri, K., Bhadra, A., Ionides, E. L., Bouma, M., Yadav, R., Dhiman, R. and Pascual, M. (2010). Forcing versus feedback: Epidemic malaria and monsoon rains in NW India. PLoS Comput. Biol. 6 e1000898.
  • Liu, W. M., Hethcote, H. W. and Levin, S. A. (1987). Dynamical behavior of epidemiological models with nonlinear incidence rates. J. Math. Biol. 25 359–380.
  • Man, K. S. (2002). Long memory time series and short term forecasts. Int. J. Forecasting 19 477–491.
  • May, R. M. (1976). Simple mathematical models with very complicated dynamics. Nature 261 459–467.
  • Nicholson, A. J. and Bailey, V. A. (1935). The balance of animal populations. Part 1. Proc. Zool. Soc. London 1 551–598.
  • Oster, G. and Ipaktchi, A. (1978). Population cycles. In Periodicities in Chemistry and Biology (H. Eyring, ed.) 111–132. Academic Press, New York.
  • Parzen, E. (1962). Stochastic Processes. Holden-Day, San Francisco, CA.
  • Pham, T. D. and Tran, L. T. (1985). Some mixing properties of time series models. Stochastic Process. Appl. 19 297–303.
  • Rohani, P., Green, C. J., Mantilla-Beniers, N. B. and Grenfell, B. T. (2003). Ecological interference between fatal diseases. Nature 422 885–888.
  • Romano, J. P. and Thombs, L. A. (1996). Inference for autocorrelations under weak assumptions. J. Amer. Statist. Assoc. 91 590–600.
  • Sakai, H., Soeda, T. and Tokumaru, H. (1979). On the relation between fitting autoregression and periodogram with applications. Ann. Statist. 7 96–107.
  • Slutsky, E. (1927). The summation of random causes as the source of cyclic processes. Econometrica 5 105–146.
  • Staudenmayer, J. and Buonaccorsi, J. P. (2005). Measurement error in linear autoregressive models. J. Amer. Statist. Assoc. 100 841–852.
  • Stoica, P., Moses, R. L. and Li, J. (1991). Optimal higher-order Yule-Walker estimation of sinusoidal frequencies. IEEE Trans. Signal Process. 39 1360–1368.
  • Stokes, T. G., Gurney, W. S. C., Nisbet, R. M. and Blythe, S. P. (1988). Parameter evolution in a laboratory insect population. Theor. Pop. Biol. 34 248–265.
  • Tiao, G. C. and Xu, D. (1993). Robustness of maximum likelihood estimates for multi-step predictions: The exponential smoothing case. Biometrika 80 623–641.
  • Tong, H. (1990). Nonlinear Time Series: A Dynamical System Approach. Oxford Statistical Science Series 6. Oxford Univ. Press, New York.
  • Tong, H. and Lim, K. S. (1980). Threshold autoregression, limit cycles and cyclical data (with discussion). J. Roy. Statist. Soc. Ser. B 42 245–292.
  • Tsay, R. S. (1992). Model checking via parametric bootstraps in time series analysis. J. Roy. Statist. Soc. Ser. C 41 1–15.
  • Varley, G. C., Gradwell, G. R. and Hassell, M. P. (1973). Insect Population Ecology. Univ. California Press, Berkeley.
  • Walker, A. M. (1960). Some consequences of superimposed error in time series analysis. Biometrika 47 33–43.
  • Whittle, P. (1962). Gaussian estimation in stationary time series. Bull. Inst. Internat. Statist. 39 105–129.
  • Wood, S. N. (2001). Partially specified ecological models. Ecological Monographs 71 1–25.
  • Wu, C. F. J. (1981). Asymptotic theory of nonlinear least squares estimation. Ann. Statist. 9 501–513.
  • Yule, G. U. (1927). On a method of investigating periodicities in disturbed series, with special reference to Wolfer’s sunspot numbers. Philos. Trans. R. Soc. Lond. Ser. A 226 267–298.