The Annals of Statistics

Extreme eigenvalues of large-dimensional spiked Fisher matrices with application

Qinwen Wang and Jianfeng Yao



Consider two $p$-variate populations, not necessarily Gaussian, with covariance matrices $\Sigma_{1}$ and $\Sigma_{2}$, respectively. Let $S_{1}$ and $S_{2}$ be the corresponding sample covariance matrices with degrees of freedom $m$ and $n$. When the difference $\Delta$ between $\Sigma_{1}$ and $\Sigma_{2}$ is of small rank compared to $p$, $m$ and $n$, the Fisher matrix $S:=S_{2}^{-1}S_{1}$ is called a spiked Fisher matrix. When $p$, $m$ and $n$ grow to infinity proportionally, we establish a phase transition for the extreme eigenvalues of the Fisher matrix. A displacement formula shows that when the eigenvalues of $\Delta$ (the spikes) lie above (or below) a critical value, the associated extreme eigenvalues of $S$ converge to a point outside the support of the global limit (the limiting spectral distribution, LSD) of the other eigenvalues, that is, they become outliers; otherwise, they converge to the edge points of the LSD. Furthermore, we derive central limit theorems for those outlier eigenvalues of $S$. The limiting distributions are found to be Gaussian if and only if the corresponding population spike eigenvalues in $\Delta$ are simple. Two applications are introduced. The first uses the largest eigenvalue of the Fisher matrix to test the equality of two high-dimensional covariance matrices, and an explicit power function is found under the spiked alternative. The second is in the field of signal detection, where an estimator for the number of signals is proposed while the covariance structure of the noise is arbitrary.
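The outlier phenomenon described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes Gaussian data, $\Sigma_{2}=I$, and $\Sigma_{1}$ equal to the identity with a single diagonal entry replaced by a hypothetical value `spike`. The function names are illustrative only. The right-edge formula used for comparison is the classical one for the limiting distribution of a Fisher matrix with ratios $c=p/m$ and $y=p/n<1$; it is an assumption brought in from the general literature, not a formula stated in the abstract.

```python
import numpy as np


def fisher_eigenvalues(p, m, n, spike=1.0, seed=0):
    """Eigenvalues of S2^{-1} S1, sorted in decreasing order.

    Assumed toy model: Sigma2 = I and Sigma1 = diag(spike, 1, ..., 1),
    with Gaussian samples. spike = 1.0 gives the null (no-spike) case.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, m))
    X[0, :] *= np.sqrt(spike)          # one population spike in Sigma1
    Y = rng.standard_normal((p, n))
    S1 = X @ X.T / m                   # sample covariance, m degrees of freedom
    S2 = Y @ Y.T / n                   # sample covariance, n degrees of freedom

    # S2^{-1} S1 has the same eigenvalues as the symmetric matrix
    # L^{-1} S1 L^{-T}, where S2 = L L^T is the Cholesky factorization.
    L = np.linalg.cholesky(S2)
    A = np.linalg.solve(L, S1)         # L^{-1} S1
    M = np.linalg.solve(L, A.T)        # L^{-1} S1 L^{-T}
    return np.linalg.eigvalsh(M)[::-1]


def wachter_right_edge(p, m, n):
    """Right edge of the limiting support, with c = p/m and y = p/n < 1.

    Classical edge formula for Fisher matrices (an assumption here).
    """
    c, y = p / m, p / n
    h = np.sqrt(c + y - c * y)
    return ((1.0 + h) / (1.0 - y)) ** 2


if __name__ == "__main__":
    p, m, n = 100, 400, 400
    edge = wachter_right_edge(p, m, n)
    null = fisher_eigenvalues(p, m, n, spike=1.0)
    spiked = fisher_eigenvalues(p, m, n, spike=20.0)
    print("right edge of the bulk:", edge)
    print("largest eigenvalue, no spike:", null[0])
    print("two largest eigenvalues, spike = 20:", spiked[:2])
```

In this toy setting the largest eigenvalue under a strong spike should sit well above the bulk edge, while the second largest stays near it. The sketch says nothing about the critical value or the limiting location of the outlier, which are the subject of the paper itself.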

Article information

Ann. Statist., Volume 45, Number 1 (2017), 415-460.

Received: April 2015
Revised: March 2016
First available in Project Euclid: 21 February 2017


Primary: 62H12: Estimation
Secondary: 60F05: Central limit and other weak theorems

Keywords: Large-dimensional Fisher matrices; spiked Fisher matrix; spiked population model; extreme eigenvalue; phase transition; central limit theorem; signal detection; high-dimensional data analysis


Wang, Qinwen; Yao, Jianfeng. Extreme eigenvalues of large-dimensional spiked Fisher matrices with application. Ann. Statist. 45 (2017), no. 1, 415–460. doi:10.1214/16-AOS1463.


