The Annals of Statistics

Nonlinear principal components and long-run implications of multivariate diffusions

Xiaohong Chen, Lars Peter Hansen, and José Scheinkman


We investigate a method for extracting nonlinear principal components (NPCs). These NPCs maximize variation subject to smoothness and orthogonality constraints, but we allow for a general class of constraints and multivariate probability densities, including densities without compact support and even densities with algebraic tails. We provide primitive sufficient conditions for the existence of these NPCs. By exploiting the theory of continuous-time, reversible Markov diffusion processes, we give a different interpretation of these NPCs and the smoothness constraints. When the diffusion matrix is used to enforce smoothness, the NPCs maximize long-run variation relative to the overall variation subject to orthogonality constraints. Moreover, the NPCs behave as scalar autoregressions with heteroskedastic innovations; this supports semiparametric identification and estimation of a multivariate reversible diffusion process and tests of the overidentifying restrictions implied by such a process from low-frequency data. We also explore implications for stationary, possibly nonreversible diffusion processes. Finally, we suggest a sieve method to estimate the NPCs from discretely sampled data.
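
As a rough illustration of the sieve idea mentioned in the abstract, the following sketch approximates NPCs for a scalar sample. It is not the authors' estimator. It assumes, for illustration only, that smoothness is measured by the sample analogue of E[φ'(x)²] (an identity diffusion coefficient), that the sieve is a small polynomial basis, and that the NPCs are obtained from a generalized eigenvalue problem in which small eigenvalues correspond to smooth, high-variation functions. The function name `sieve_npcs` and its `degree` parameter are hypothetical.

```python
import numpy as np

def sieve_npcs(x, degree=4):
    """Approximate NPCs of a scalar sample with a polynomial sieve.

    Illustrative assumption: smoothness is penalized through the
    sample mean of the squared derivative (identity diffusion).
    """
    x = np.asarray(x, dtype=float)
    powers = np.arange(1, degree + 1)
    # Basis functions x, x^2, ..., x^degree, centred to have sample mean zero
    basis = x[:, None] ** powers
    basis = basis - basis.mean(axis=0)
    # Derivatives of the basis functions: k * x^(k-1)
    dbasis = powers * x[:, None] ** (powers - 1)
    n = len(x)
    gram = basis.T @ basis / n      # sample analogue of E[psi_i psi_j]
    form = dbasis.T @ dbasis / n    # sample analogue of E[psi_i' psi_j']
    # Solve form @ c = lam * gram @ c by whitening with a Cholesky factor
    chol = np.linalg.cholesky(gram)
    inv = np.linalg.inv(chol)
    lam, vecs = np.linalg.eigh(inv @ form @ inv.T)
    coefs = inv.T @ vecs            # columns: coefficients on the centred basis
    return lam, coefs               # eigenvalues in ascending order

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lam, coefs = sieve_npcs(rng.standard_normal(100_000), degree=3)
    print(lam)
```

Under these assumptions and a standard normal sample, the approximate NPCs should be close to Hermite polynomials, with eigenvalues near 1, 2 and 3 up to sampling error. Heavier-tailed densities or a state-dependent diffusion coefficient would require a different basis and a weighted quadratic form.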

Article information

Ann. Statist. Volume 37, Number 6B (2009), 4279-4312.

First available in Project Euclid: 23 October 2009

Primary: 62H25 (Factor analysis and principal components; correspondence analysis), 47D07 (Markov semigroups and applications to diffusion processes {For Markov processes, see 60Jxx})
Secondary: 35P05 (General topics in linear spectral theory)

Keywords: Nonlinear principal components; multivariate diffusion; quadratic form; conditional expectations operator; low-frequency data


Chen, Xiaohong; Hansen, Lars Peter; Scheinkman, José. Nonlinear principal components and long-run implications of multivariate diffusions. Ann. Statist. 37 (2009), no. 6B, 4279–4312. doi:10.1214/09-AOS706.

