The Annals of Applied Probability

Determinant of sample correlation matrix with application

Tiefeng Jiang

Abstract

Let $\mathbf{x}_{1},\ldots ,\mathbf{x}_{n}$ be independent random vectors from a common $p$-dimensional normal distribution with population correlation matrix $\mathbf{R}_{n}$. The sample correlation matrix $\hat{\mathbf {R}}_{n}=(\hat{r}_{ij})_{p\times p}$ is formed from $\mathbf{x}_{1},\ldots ,\mathbf{x}_{n}$ so that $\hat{r}_{ij}$ is the Pearson correlation coefficient between the $i$th column and the $j$th column of the data matrix $(\mathbf{x}_{1},\ldots ,\mathbf{x}_{n})'$. The matrix $\hat{\mathbf {R}}_{n}$ is a popular object in multivariate analysis and has many connections to other problems. We derive a central limit theorem (CLT) for the logarithm of the determinant of $\hat{\mathbf {R}}_{n}$ for a large class of $\mathbf{R}_{n}$. The expressions for the mean and the variance in the CLT are not obvious and were not previously known. In particular, the CLT holds if $p/n$ has a nonzero limit and the smallest eigenvalue of $\mathbf{R}_{n}$ is larger than $1/2$. In addition, a formula for the moments of $\vert \hat{\mathbf {R}}_{n}\vert$ and a new method for showing weak convergence are introduced. We apply the CLT to a high-dimensional statistical test.
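
To make the object of study concrete, the following is a minimal sketch that simulates Gaussian data, forms the sample correlation matrix $\hat{\mathbf {R}}_{n}$ from the columns of the data matrix, and evaluates $\log \vert \hat{\mathbf {R}}_{n}\vert$. It is an illustration only, not the paper's method: the choice $\mathbf{R}_{n}=\mathbf{I}$, the values of $n$ and $p$, and all function names are assumptions made here, and the mean and variance from the CLT are not reproduced because the abstract does not state them.

```python
import math
import random


def sample_correlation(data):
    """Pearson correlation matrix of the columns of an n x p data matrix."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Euclidean norm of each centered column
    norms = [math.sqrt(sum(row[j] ** 2 for row in centered)) for j in range(p)]
    return [
        [sum(row[i] * row[j] for row in centered) / (norms[i] * norms[j])
         for j in range(p)]
        for i in range(p)
    ]


def log_det_spd(matrix):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    p = len(matrix)
    chol = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    # det(A) = prod(diag(chol))^2, so log det(A) = 2 * sum(log(diag(chol)))
    return 2.0 * sum(math.log(chol[i][i]) for i in range(p))


def simulate_log_det(n, p, rng):
    """Draw n i.i.d. N(0, I_p) vectors (assumed R_n = I) and return log|R_hat|."""
    data = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    return log_det_spd(sample_correlation(data))


rng = random.Random(0)   # fixed seed, chosen only for reproducibility
n, p = 200, 20           # hypothetical sizes with p/n = 0.1 and n > p
log_det = simulate_log_det(n, p, rng)
print(log_det)
```

Because $\hat{\mathbf {R}}_{n}$ has unit diagonal, Hadamard's inequality gives $\vert \hat{\mathbf {R}}_{n}\vert \le 1$, so the printed value is nonpositive. Repeating `simulate_log_det` many times and plotting a histogram would give an empirical view of the fluctuations that the CLT describes.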

Article information

Source
Ann. Appl. Probab., Volume 29, Number 3 (2019), 1356-1397.

Dates
Revised: August 2017
First available in Project Euclid: 19 February 2019

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1550566833

Digital Object Identifier
doi:10.1214/17-AAP1362

Mathematical Reviews number (MathSciNet)
MR3914547

Citation

Jiang, Tiefeng. Determinant of sample correlation matrix with application. Ann. Appl. Probab. 29 (2019), no. 3, 1356–1397. doi:10.1214/17-AAP1362. https://projecteuclid.org/euclid.aoap/1550566833
