The Annals of Statistics

Factor modeling for high-dimensional time series: Inference for the number of factors

Clifford Lam and Qiwei Yao


Abstract

This paper deals with factor modeling for high-dimensional time series from a dimension-reduction viewpoint. Under stationary settings, the inference is simple in the sense that both the number of factors and the factor loadings are estimated through an eigenanalysis of a nonnegative definite matrix; it is therefore applicable when the dimension of the time series is on the order of a few thousand. Asymptotic properties of the proposed method are investigated under two settings: (i) the sample size goes to infinity while the dimension of the time series is fixed; and (ii) both the sample size and the dimension of the time series go to infinity together. In particular, our estimators for zero-eigenvalues enjoy faster convergence (or slower divergence) rates, which makes the estimation of the number of factors easier. When the sample size and the dimension of the time series go to infinity together, the estimators for the eigenvalues are no longer consistent. However, our estimator for the number of factors, which is based on the ratios of the estimated eigenvalues, still works well. Furthermore, this estimation exhibits the so-called “blessing of dimensionality” property, in the sense that its performance may improve as the dimension of the time series increases. A two-step procedure is investigated for the case where the factors are of different degrees of strength. Numerical illustrations with both simulated and real data are also reported.
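The ratio-based idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the nonnegative definite matrix, so the sketch assumes it is the sum of products of lagged sample autocovariance matrices with their transposes over lags 1 to `k0`. The function name `estimate_num_factors`, the parameters `k0` and `r_max`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_num_factors(y, k0=1, r_max=None):
    """Ratio-based estimate of the number of factors (illustrative sketch).

    y     : (n, p) array, n observations of a p-dimensional series.
    k0    : number of lags used to build the matrix M (assumed form).
    r_max : largest candidate number of factors (assumed default p // 2).
    """
    n, p = y.shape
    yc = y - y.mean(axis=0)

    # Assumed matrix: M = sum_k Sigma_y(k) Sigma_y(k)^T, nonnegative definite.
    M = np.zeros((p, p))
    for k in range(1, k0 + 1):
        sigma_k = yc[k:].T @ yc[:-k] / (n - k)  # lag-k sample autocovariance
        M += sigma_k @ sigma_k.T

    # eigh returns eigenvalues in ascending order; reverse to descending.
    eigvals, eigvecs = np.linalg.eigh(M)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

    if r_max is None:
        r_max = p // 2
    # Ratios lambda_{i+1} / lambda_i for i = 1, ..., r_max; pick the minimizer.
    ratios = eigvals[1:r_max + 1] / eigvals[:r_max]
    r_hat = int(np.argmin(ratios)) + 1

    # Leading eigenvectors serve as the estimated factor loading space.
    return r_hat, eigvals, eigvecs[:, :r_hat]


# Hypothetical simulated example: 3 AR(1) factors, 20 series, white noise.
rng = np.random.default_rng(0)
n, p, r = 500, 20, 3
phi = np.array([0.8, 0.7, 0.6])          # AR(1) coefficients (assumed)
x = np.zeros((n, r))
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal(r)

q, _ = np.linalg.qr(rng.standard_normal((p, r)))
A = np.sqrt(p) * q                        # strong factors: A^T A = p * I
y = x @ A.T + rng.standard_normal((n, p))

r_hat, eigvals, loadings = estimate_num_factors(y, k0=1)
print("estimated number of factors:", r_hat)
```

Because only lagged autocovariances enter the assumed matrix, white noise contributes little to its leading eigenvalues, so the ratio of consecutive eigenvalues tends to drop sharply at the true number of factors.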

Article information

Source
Ann. Statist. Volume 40, Number 2 (2012), 694-726.

Dates
First available in Project Euclid: 17 May 2012

Permanent link to this document
http://projecteuclid.org/euclid.aos/1337268209

Digital Object Identifier
doi:10.1214/12-AOS970

Mathematical Reviews number (MathSciNet)
MR2933663

Zentralblatt MATH identifier
1273.62214

Subjects
Primary: 62M10: Time series, auto-correlation, regression, etc. [See also 91B84]; 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
Secondary: 60G99: None of the above, but in this section

Keywords
Autocovariance matrices; blessing of dimensionality; eigenanalysis; fast convergence rates; multivariate time series; ratio-based estimator; strength of factors; white noise

Citation

Lam, Clifford; Yao, Qiwei. Factor modeling for high-dimensional time series: Inference for the number of factors. Ann. Statist. 40 (2012), no. 2, 694–726. doi:10.1214/12-AOS970. http://projecteuclid.org/euclid.aos/1337268209.



