The Annals of Probability

Largest entries of sample correlation matrices from equi-correlated normal populations

Jianqing Fan and Tiefeng Jiang

Abstract

This paper studies the limiting distribution of the largest off-diagonal entry of the sample correlation matrix of a high-dimensional Gaussian population with an equi-correlation structure. Assume that the components of the population vector share a common correlation coefficient $\rho >0$ and that both the population dimension $p$ and the sample size $n$ tend to infinity with $\log p=o(n^{1/3})$. For $0<\rho <1$, we prove that the largest off-diagonal entry of the sample correlation matrix converges to a Gaussian distribution, and that the same holds for the sample covariance matrix when $0<\rho <1/2$. This differs substantially from a well-known result for the independent case $\rho =0$, in which the limiting distribution is an extreme-value distribution. We then study the phase transition between these two limiting distributions and identify the regime of $\rho $ where the transition occurs. When $\rho $ is below, above, or equal to the threshold, the limiting distribution is the extreme-value distribution, the Gaussian distribution, or a convolution of the two, respectively. The proofs rely on a subtle use of the Chen–Stein Poisson approximation method, conditioning, a coupling to create independence, and a special property of sample correlation matrices. An application is given to a statistical testing problem.
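The statistic described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes the standard one-factor construction $X_j=\sqrt{\rho }Z_0+\sqrt{1-\rho }Z_j$ with independent standard normal $Z_0,Z_1,\ldots ,Z_p$, which gives every pair of coordinates correlation $\rho $. It then computes the largest off-diagonal entry of the sample correlation matrix. The function names `equicorrelated_sample` and `max_offdiag_correlation` and the values of `n`, `p` and `rho` are illustrative choices, and the paper's centering and scaling are not reproduced.

```python
import math
import random


def equicorrelated_sample(n, p, rho, rng):
    """Draw n observations of a p-dimensional normal vector whose
    coordinates have unit variance and common correlation rho.

    Assumed construction: X_j = sqrt(rho) * Z0 + sqrt(1 - rho) * Z_j.
    """
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    rows = []
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)  # factor shared by all coordinates
        rows.append([a * z0 + b * rng.gauss(0.0, 1.0) for _ in range(p)])
    return rows


def max_offdiag_correlation(rows):
    """Largest off-diagonal entry of the sample (Pearson) correlation matrix."""
    n, p = len(rows), len(rows[0])
    unit_cols = []
    for j in range(p):
        col = [row[j] for row in rows]
        mean = sum(col) / n
        centered = [x - mean for x in col]
        norm = math.sqrt(sum(x * x for x in centered))
        unit_cols.append([x / norm for x in centered])
    best = -1.0
    for i in range(p):
        for j in range(i + 1, p):
            # Inner product of centered, unit-norm columns is the correlation.
            r = sum(u * v for u, v in zip(unit_cols[i], unit_cols[j]))
            best = max(best, r)
    return best


rng = random.Random(0)  # fixed seed so the run is repeatable
stat = max_offdiag_correlation(equicorrelated_sample(n=200, p=30, rho=0.5, rng=rng))
print(stat)
```

Repeating the draw many times and plotting a histogram of `stat` for several values of `rho` gives an informal view of the two regimes, although moderate `n` and `p` can only hint at the asymptotic behavior.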

Article information

Source
Ann. Probab., Volume 47, Number 5 (2019), 3321-3374.

Dates
Received: September 2017
Revised: January 2019
First available in Project Euclid: 22 October 2019

Permanent link to this document
https://projecteuclid.org/euclid.aop/1571731453

Digital Object Identifier
doi:10.1214/19-AOP1341

Mathematical Reviews number (MathSciNet)
MR4021253

Subjects
Primary: 62H10: Distribution of statistics; 62E20: Asymptotic distribution theory
Secondary: 60F05: Central limit and other weak theorems

Keywords
Maximum sample correlation; phase transition; multivariate normal distribution; Gumbel distribution; Chen–Stein Poisson approximation

Citation

Fan, Jianqing; Jiang, Tiefeng. Largest entries of sample correlation matrices from equi-correlated normal populations. Ann. Probab. 47 (2019), no. 5, 3321–3374. doi:10.1214/19-AOP1341. https://projecteuclid.org/euclid.aop/1571731453

