## The Annals of Statistics

### Universality for the largest eigenvalue of sample covariance matrices with general population

#### Abstract

This paper is aimed at deriving the universality of the largest eigenvalue of a class of high-dimensional real or complex sample covariance matrices of the form $\mathcal{W}_{N}=\Sigma^{1/2}XX^{*}\Sigma^{1/2}$. Here, $X=(x_{ij})_{M,N}$ is an $M\times N$ random matrix with independent entries $x_{ij}$, $1\leq i\leq M$, $1\leq j\leq N$ such that $\mathbb{E}x_{ij}=0$, $\mathbb{E}|x_{ij}|^{2}=1/N$. On dimensionality, we assume that $M=M(N)$ and $N/M\rightarrow d\in(0,\infty)$ as $N\rightarrow\infty$. For a class of general deterministic positive-definite $M\times M$ matrices $\Sigma$, under some additional assumptions on the distribution of $x_{ij}$’s, we show that the limiting behavior of the largest eigenvalue of $\mathcal{W}_{N}$ is universal, by pursuing a Green function comparison strategy introduced in [Probab. Theory Related Fields 154 (2012) 341–407, Adv. Math. 229 (2012) 1435–1515] by Erdős, Yau and Yin for Wigner matrices and extended by Pillai and Yin [Ann. Appl. Probab. 24 (2014) 935–1001] to sample covariance matrices in the null case ($\Sigma=I$). Consequently, in the standard complex case ($\mathbb{E}x_{ij}^{2}=0$), combining this universality property with the results known for Gaussian matrices obtained by El Karoui in [Ann. Probab. 35 (2007) 663–714] (nonsingular case) and Onatski in [Ann. Appl. Probab. 18 (2008) 470–490] (singular case), we show that after an appropriate normalization the largest eigenvalue of $\mathcal{W}_{N}$ converges weakly to the type 2 Tracy–Widom distribution $\mathrm{TW}_{2}$. Moreover, in the real case, we show that when $\Sigma$ is spiked with a fixed number of subcritical spikes, the type 1 Tracy–Widom limit $\mathrm{TW}_{1}$ holds for the normalized largest eigenvalue of $\mathcal{W}_{N}$, which extends a result of Féral and Péché in [J. Math. Phys. 50 (2009) 073302] to the scenario of nondiagonal $\Sigma$ and more generally distributed $X$.
In summary, we establish the Tracy–Widom type universality for the largest eigenvalue of generally distributed sample covariance matrices under quite light assumptions on $\Sigma$. Applications of these limiting results to statistical signal detection and structure recognition of separable covariance matrices are also discussed.
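As a rough numerical illustration of the model in the abstract (not the paper's method), the following sketch draws $X$ with independent entries of mean 0 and variance $1/N$ and computes the largest eigenvalue of $\Sigma^{1/2}XX^{*}\Sigma^{1/2}$. Several things here are assumptions made for simplicity: the entries are real, $\Sigma$ is diagonal and passed as a vector, and the function name `largest_eigenvalue` and the parameter `entries` are hypothetical. Gaussian and Rademacher entries are compared only to hint at the universality phenomenon in the null case $\Sigma=I$.

```python
import numpy as np


def largest_eigenvalue(M, N, sigma_diag, rng, entries="gaussian"):
    """Largest eigenvalue of W = Sigma^{1/2} X X^T Sigma^{1/2}.

    Assumptions (illustrative only): real entries and a diagonal Sigma,
    given by the length-M vector `sigma_diag` of positive numbers.
    """
    if entries == "gaussian":
        X = rng.standard_normal((M, N))
    elif entries == "rademacher":
        X = rng.choice([-1.0, 1.0], size=(M, N))
    else:
        raise ValueError(f"unknown entry distribution: {entries}")
    # Rescale so that each entry has mean 0 and variance 1/N.
    X /= np.sqrt(N)
    # Multiply row i by sqrt(sigma_i), i.e. form Sigma^{1/2} X.
    Y = np.sqrt(sigma_diag)[:, None] * X
    # eigvalsh returns eigenvalues in ascending order; take the last one.
    return np.linalg.eigvalsh(Y @ Y.T)[-1]


rng = np.random.default_rng(0)
M, N = 200, 400
sigma_diag = np.ones(M)  # null case Sigma = I
edge = (1 + np.sqrt(M / N)) ** 2  # right edge of the Marchenko-Pastur law
for kind in ("gaussian", "rademacher"):
    lam = largest_eigenvalue(M, N, sigma_diag, rng, entries=kind)
    print(kind, lam, edge)
```

In the null case both entry distributions should give a largest eigenvalue close to the Marchenko–Pastur right edge $(1+\sqrt{M/N})^{2}$, with fluctuations of order $N^{-2/3}$. A single draw like this says nothing about the Tracy–Widom limit itself; that would require many repetitions and the appropriate centering and scaling.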

#### Article information

Source
Ann. Statist., Volume 43, Number 1 (2015), 382–421.

Dates
First available in Project Euclid: 6 February 2015

Permanent link to this document
https://projecteuclid.org/euclid.aos/1423230084

Digital Object Identifier
doi:10.1214/14-AOS1281

Mathematical Reviews number (MathSciNet)
MR3311864

Zentralblatt MATH identifier
06420692

#### Citation

Bao, Zhigang; Pan, Guangming; Zhou, Wang. Universality for the largest eigenvalue of sample covariance matrices with general population. Ann. Statist. 43 (2015), no. 1, 382–421. doi:10.1214/14-AOS1281. https://projecteuclid.org/euclid.aos/1423230084

#### References

• [1] Bai, Z. and Silverstein, J. W. (2010). Spectral Analysis of Large Dimensional Random Matrices, 2nd ed. Springer, New York.
• [2] Bai, Z. and Yao, J.-f. (2008). Central limit theorems for eigenvalues in a spiked population model. Ann. Inst. Henri Poincaré Probab. Stat. 44 447–474.
• [3] Bai, Z. D. (1999). Methodologies in spectral analysis of large-dimensional random matrices, a review. Statist. Sinica 9 611–677.
• [4] Baik, J., Ben Arous, G. and Péché, S. (2005). Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. Ann. Probab. 33 1643–1697.
• [5] Bao, Z., Pan, G. and Zhou, W. (2012). Tracy–Widom law for the extreme eigenvalues of sample correlation matrices. Electron. J. Probab. 17 1–32.
• [6] Bao, Z., Pan, G. and Zhou, W. (2014). Supplement to “Universality for the largest eigenvalue of sample covariance matrices with general population.” DOI:10.1214/14-AOS1281SUPP.
• [7] Bao, Z. G., Pan, G. M. and Zhou, W. (2013). Local density of the spectrum on the edge for sample covariance matrices with general population. Preprint.
• [8] Bianchi, P., Debbah, M., Maida, M. and Najim, J. (2011). Performance of statistical tests for single-source detection using random matrix theory. IEEE Trans. Inform. Theory 57 2400–2419.
• [9] Bloemendal, A. and Virág, B. (2011). Limits of spiked random matrices II. Preprint. Available at arXiv:1109.3704.
• [10] Bloemendal, A. and Virág, B. (2013). Limits of spiked random matrices I. Probab. Theory Related Fields 156 795–825.
• [11] Eaton, M. L. (1989). Group Invariance Applications in Statistics. NSF-CBMS Regional Conference Series in Probability and Statistics 1. IMS, Hayward, CA.
• [12] El Karoui, N. (2007). Tracy–Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices. Ann. Probab. 35 663–714.
• [13] El Karoui, N. (2009). Concentration of measure and spectra of random matrices: Applications to correlation matrices, elliptical distributions and beyond. Ann. Appl. Probab. 19 2362–2405.
• [14] Erdős, L. (2011). Universality of Wigner random matrices: A survey of recent results. Russ. Math. Surv. 66 507.
• [15] Erdős, L., Knowles, A., Yau, H.-T. and Yin, J. (2012). Spectral statistics of Erdős–Rényi graphs II: Eigenvalue spacing and the extreme eigenvalues. Comm. Math. Phys. 314 587–640.
• [16] Erdős, L., Schlein, B. and Yau, H.-T. (2009). Semicircle law on short scales and delocalization of eigenvectors for Wigner random matrices. Ann. Probab. 37 815–852.
• [17] Erdős, L., Schlein, B. and Yau, H.-T. (2009). Local semicircle law and complete delocalization for Wigner random matrices. Comm. Math. Phys. 287 641–655.
• [18] Erdős, L., Schlein, B. and Yau, H.-T. (2010). Wegner estimate and level repulsion for Wigner random matrices. Int. Math. Res. Not. IMRN 2010 436–479.
• [19] Erdős, L., Schlein, B. and Yau, H.-T. (2011). Universality of random matrices and local relaxation flow. Invent. Math. 185 75–119.
• [20] Erdős, L., Schlein, B., Yau, H.-T. and Yin, J. (2012). The local relaxation flow approach to universality of the local statistics for random matrices. Ann. Inst. Henri Poincaré Probab. Stat. 48 1–46.
• [21] Erdős, L., Yau, H.-T. and Yin, J. (2011). Universality for generalized Wigner matrices with Bernoulli distribution. J. Comb. 2 15–82.
• [22] Erdős, L., Yau, H.-T. and Yin, J. (2012). Bulk universality for generalized Wigner matrices. Probab. Theory Related Fields 154 341–407.
• [23] Erdős, L., Yau, H.-T. and Yin, J. (2012). Rigidity of eigenvalues of generalized Wigner matrices. Adv. Math. 229 1435–1515.
• [24] Féral, D. and Péché, S. (2009). The largest eigenvalues of sample covariance matrices for a spiked population: Diagonal case. J. Math. Phys. 50 073302.
• [25] Fisher, R. A. (1939). The sampling distribution of some statistics obtained from non-linear equations. Ann. Eugenics 9 238–249.
• [26] Hsu, P. L. (1939). On the distribution of roots of certain determinantal equations. Ann. Eugenics 9 250–258.
• [27] Johansson, K. (2000). Shape fluctuations and random matrices. Comm. Math. Phys. 209 437–476.
• [28] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295–327.
• [29] Johnstone, I. M. (2007). High dimensional statistical inference and random matrices. In International Congress of Mathematicians I 307–333. Eur. Math. Soc., Zürich.
• [30] Kay, S. M. (1998). Fundamentals of Statistical Signal Processing, Vol. II: Detection Theory. Prentice Hall, Upper Saddle River, NJ.
• [31] Lee, J. O. and Yin, J. (2014). A necessary and sufficient condition for edge universality of Wigner matrices. Duke Math. J. 163 117–173.
• [32] Lindeberg, J. W. (1922). Eine neue Herleitung des Exponentialgesetzes in der Wahrscheinlichkeitsrechnung. Math. Z. 15 211–225.
• [33] Marčenko, V. A. and Pastur, L. A. (1967). Distribution for some sets of random matrices. Math. USSR-Sb. 1 457–483.
• [34] Mo, M. Y. (2012). Rank 1 real Wishart spiked model. Comm. Pure Appl. Math. 65 1528–1638.
• [35] Nadakuditi, R. R. and Edelman, A. (2008). Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples. IEEE Trans. Signal Process. 56 2625–2638.
• [36] Nadakuditi, R. R. and Silverstein, J. W. (2010). Fundamental limit of sample generalized eigenvalue based detection of signals in noise using relatively few signal-bearing and noise-only samples. IEEE J. Sel. Top. Signal Process. 4 468–480.
• [37] Onatski, A. (2007). A formal statistical test for the number of factors in the approximate factor models. Unpublished manuscript.
• [38] Onatski, A. (2008). The Tracy–Widom limit for the largest eigenvalues of singular complex Wishart matrices. Ann. Appl. Probab. 18 470–490.
• [39] Onatski, A. (2009). Testing hypotheses about the numbers of factors in large factor models. Econometrica 77 1447–1479.
• [40] Onatski, A., Moreira, M. J. and Hallin, M. (2013). Asymptotic power of sphericity tests for high-dimensional data. Ann. Statist. 41 1204–1231.
• [41] Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statist. Sinica 17 1617–1642.
• [42] Paul, D. and Silverstein, J. W. (2009). No eigenvalues outside the support of the limiting empirical spectral distribution of a separable covariance matrix. J. Multivariate Anal. 100 37–57.
• [43] Péché, S. (2009). Universality results for the largest eigenvalues of some sample covariance matrix ensembles. Probab. Theory Related Fields 143 481–516.
• [44] Penna, F. and Garello, R. (2009). Theoretical performance analysis of eigenvalue-based detection. Available at arXiv:0907.1523.
• [45] Pillai, N. S. and Yin, J. (2012). Edge universality of correlation matrices. Ann. Statist. 40 1737–1763.
• [46] Pillai, N. S. and Yin, J. (2014). Universality of covariance matrices. Ann. Appl. Probab. 24 935–1001.
• [47] Roy, S. N. (1939). p-Statistics and some generalizations in analysis of variance appropriate to multivariate problems. Sankhyā 4 381–396.
• [48] Silverstein, J. W. and Choi, S.-I. (1995). Analysis of the limiting spectral distribution of large-dimensional random matrices. J. Multivariate Anal. 54 295–309.
• [49] Soshnikov, A. (2002). A note on universality of the distribution of the largest eigenvalues in certain sample covariance matrices. J. Stat. Phys. 108 1033–1056.
• [50] Tao, T. and Vu, V. (2011). Random matrices: Universality of local eigenvalue statistics. Acta Math. 206 127–204.
• [51] Tao, T. and Vu, V. (2012). Random covariance matrices: Universality of local statistics of eigenvalues. Ann. Probab. 40 1285–1315.
• [52] Tracy, C. A. and Widom, H. (1994). Level-spacing distributions and the Airy kernel. Comm. Math. Phys. 159 151–174.
• [53] Tracy, C. A. and Widom, H. (1996). On orthogonal and symplectic matrix ensembles. Comm. Math. Phys. 177 727–754.
• [54] Vinogradova, J., Couillet, R. and Hachem, W. (2013). Statistical inference in large antenna arrays under unknown noise pattern. IEEE Trans. Signal Process. 61 5633–5645.
• [55] Wang, D. (2012). The largest eigenvalue of real symmetric, Hermitian and Hermitian self-dual random matrix models with rank one external source, Part I. J. Stat. Phys. 146 719–761.
• [56] Wang, K. (2012). Random covariance matrices: Universality of local statistics of eigenvalues up to the edge. Random Matrices Theory Appl. 1 1150005, 24.
• [57] Wang, L. and Paul, D. (2014). Limiting spectral distribution of renormalized separable sample covariance matrices when $p/n\to 0$. J. Multivariate Anal. 126 25–52.
• [58] Zhang, L. X. (2006). Spectral analysis of large dimensional random matrices. Ph.D. thesis, National University of Singapore.