Annales de l'Institut Henri Poincaré, Probabilités et Statistiques

Central limit theorems for eigenvalues in a spiked population model

Zhidong Bai and Jian-feng Yao

Abstract

In a spiked population model, the population covariance matrix has all its eigenvalues equal to one except for a few fixed eigenvalues (spikes). This model was proposed by Johnstone to account for empirical findings on various data sets. The question is to quantify the effect of the perturbation caused by the spike eigenvalues. Recent work by Baik and Silverstein establishes the almost sure limits of the extreme sample eigenvalues associated with the spike eigenvalues when the population and sample sizes become large. This paper establishes the limiting distributions of these extreme sample eigenvalues. As another important result of the paper, we provide a central limit theorem for random sesquilinear forms.
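
As an illustration of the setting, the following minimal sketch simulates a single-spike model and compares the largest sample eigenvalue with the almost sure limit from Baik and Silverstein, which for a spike α > 1 + √y (with y the limiting ratio of dimension to sample size) is α + yα/(α − 1). The Gaussian entries, the single spike, the dimensions, the seed, and the function names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def largest_sample_eigenvalue(p, n, spike, rng):
    """Largest eigenvalue of the sample covariance matrix in a one-spike model.

    Assumption: Gaussian data with population covariance diag(spike, 1, ..., 1).
    """
    x = rng.standard_normal((p, n))
    x[0, :] *= np.sqrt(spike)          # first coordinate carries the spike
    sample_cov = x @ x.T / n           # p x p sample covariance matrix
    return np.linalg.eigvalsh(sample_cov)[-1]  # eigvalsh sorts ascending


def almost_sure_limit(spike, y):
    """Baik-Silverstein limit, valid only for spike > 1 + sqrt(y)."""
    if spike <= 1 + np.sqrt(y):
        raise ValueError("spike must exceed 1 + sqrt(y)")
    return spike + y * spike / (spike - 1)


rng = np.random.default_rng(0)         # hypothetical seed, for reproducibility
p, n, spike = 500, 2000, 4.0           # illustrative sizes, so y = p / n = 0.25
observed = largest_sample_eigenvalue(p, n, spike, rng)
predicted = almost_sure_limit(spike, p / n)
print(f"observed {observed:.3f}, predicted limit {predicted:.3f}")
```

Repeating the simulation over many seeds and rescaling the differences by √n would give an empirical view of the fluctuations whose limiting distribution the paper studies.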

Summary

In a model with heterogeneous variances, the eigenvalues of the covariance matrix of the variables are all equal to one except for a small number of them. This model was introduced by Johnstone as a possible explanation for the eigenvalue structure of the sample covariance matrix observed on several real data sets. An important question is to quantify the perturbation caused by these eigenvalues that differ from one. Recent work by Baik and Silverstein establishes the almost sure limit of the extreme sample eigenvalues when the number of variables tends to infinity proportionally to the sample size. This work establishes a central limit theorem for these extreme sample eigenvalues. It is based on a new central limit theorem for random sesquilinear forms.

Article information

Source
Ann. Inst. H. Poincaré Probab. Statist., Volume 44, Number 3 (2008), 447-474.

Dates
First available in Project Euclid: 26 May 2008

Permanent link to this document
https://projecteuclid.org/euclid.aihp/1211819420

Digital Object Identifier
doi:10.1214/07-AIHP118

Mathematical Reviews number (MathSciNet)
MR2451053

Zentralblatt MATH identifier
1274.62129

Subjects
Primary:
  • 62H25: Factor analysis and principal components; correspondence analysis
  • 62E20: Asymptotic distribution theory
Secondary:
  • 60F05: Central limit and other weak theorems
  • 15A52

Keywords
Sample covariance matrices; Spiked population model; Central limit theorems; Largest eigenvalue; Extreme eigenvalues; Random sesquilinear forms; Random quadratic forms

Citation

Bai, Zhidong; Yao, Jian-feng. Central limit theorems for eigenvalues in a spiked population model. Ann. Inst. H. Poincaré Probab. Statist. 44 (2008), no. 3, 447–474. doi:10.1214/07-AIHP118. https://projecteuclid.org/euclid.aihp/1211819420


References

  • Z. D. Bai, B. Q. Miao and C. R. Rao. Estimation of direction of arrival of signals: Asymptotic results. Advances in Spectrum Analysis and Array Processing, S. Haykin (Ed.), vol. II, pp. 327–347. Prentice Hall, West Nyack, New York, 1991.
  • Z. D. Bai. A note on limiting distribution of the eigenvalues of a class of random matrices. J. Math. Res. Exposition 5 (1985) 113–118.
  • Z. D. Bai. Methodologies in spectral analysis of large dimensional random matrices, a review. Statist. Sinica 9 (1999) 611–677.
  • Z. D. Bai and J. W. Silverstein. CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann. Probab. 32 (2004) 553–605.
  • Z. D. Bai and J. W. Silverstein. No eigenvalues outside the support of the limiting spectral distribution of large dimensional sample covariance matrices. Ann. Probab. 26 (1998) 316–345.
  • J. Baik and J. W. Silverstein. Eigenvalues of large sample covariance matrices of spiked population models. J. Multivariate Anal. 97 (2006) 1382–1408.
  • J. Baik, G. Ben Arous and S. Péché. Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. Ann. Probab. 33 (2005) 1643–1697.
  • R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 1985.
  • I. Johnstone. On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 (2001) 295–327.
  • V. A. Marčenko and L. A. Pastur. Distribution of eigenvalues for some sets of random matrices. Math. USSR-Sb. 1 (1967) 457–483.
  • M. L. Mehta. Random Matrices. Academic Press, New York, 1991.
  • D. Paul. Asymptotics of the leading sample eigenvalues for a spiked covariance model. Statist. Sinica 17 (2007) 1617–1642.
  • S. J. Sheather and M. C. Jones. A reliable data-based bandwidth selection method for kernel density estimation. J. Roy. Statist. Soc. Ser. B 53 (1991) 683–690.