Abstract
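In this paper, we prove a CLT for the sample canonical correlation coefficients between two high-dimensional random vectors with finite rank correlations. More precisely, consider two random vectors $\widetilde{\mathbf x}=\mathbf x+A\mathbf z$ and $\widetilde{\mathbf y}=\mathbf y+B\mathbf z$, where $\mathbf x\in\mathbb R^p$, $\mathbf y\in\mathbb R^q$ and $\mathbf z\in\mathbb R^r$ are independent random vectors with i.i.d. entries of mean zero and variance one, and $A\in\mathbb R^{p\times r}$ and $B\in\mathbb R^{q\times r}$ are two arbitrary deterministic matrices. Given $n$ samples of $\widetilde{\mathbf x}$ and $\widetilde{\mathbf y}$, we stack them into two matrices $\mathcal X=X+AZ$ and $\mathcal Y=Y+BZ$, where $X\in\mathbb R^{p\times n}$, $Y\in\mathbb R^{q\times n}$ and $Z\in\mathbb R^{r\times n}$ are random matrices with i.i.d. entries of mean zero and variance one. Let $\widetilde\lambda_1\ge\widetilde\lambda_2\ge\cdots\ge\widetilde\lambda_r$ be the largest $r$ eigenvalues of the sample canonical correlation (SCC) matrix $\mathcal C_{\mathcal X\mathcal Y}=(\mathcal X\mathcal X^\top)^{-1/2}\mathcal X\mathcal Y^\top(\mathcal Y\mathcal Y^\top)^{-1}\mathcal Y\mathcal X^\top(\mathcal X\mathcal X^\top)^{-1/2}$, and let $t_1\ge t_2\ge\cdots\ge t_r$ be the squares of the population canonical correlation coefficients between $\widetilde{\mathbf x}$ and $\widetilde{\mathbf y}$. Under certain moment assumptions, we show that there exists a threshold $t_c\in(0,1)$ such that if $t_i>t_c$, then $\sqrt n\,(\widetilde\lambda_i-\theta_i)$ converges weakly to a centered normal distribution, where $\theta_i$ is a fixed outlier location determined by $t_i$. Our proof uses a self-adjoint linearization of the SCC matrix and a sharp local law on the inverse of the linearized matrix.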
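The setup in the abstract can be explored numerically. The following is a minimal sketch, not the paper's method: it assumes the signal-plus-noise model in which the two data matrices are X + AZ and Y + BZ with noise matrices X (p x n), Y (q x n), Z (r x n) and deterministic A (p x r), B (q x r), and it draws Gaussian entries for convenience (the abstract only requires i.i.d. entries with mean zero and variance one). The function names `scc_eigenvalues`, `population_cc_squares` and `simulate_scc` are illustrative. The SCC eigenvalues are computed as squared singular values of the product of orthonormal bases of the two row spaces, which avoids forming inverse square roots explicitly and requires p, q ≤ n with full-rank data.

```python
import numpy as np


def scc_eigenvalues(Xc, Yc):
    """Nonzero eigenvalues of the SCC matrix of data matrices Xc (p x n), Yc (q x n).

    They coincide with the squared singular values of Qx^T Qy, where Qx and Qy
    are orthonormal bases of the row spaces of Xc and Yc. Assumes p, q <= n.
    """
    Qx, _ = np.linalg.qr(Xc.T)  # n x p, orthonormal columns
    Qy, _ = np.linalg.qr(Yc.T)  # n x q, orthonormal columns
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.sort(s**2)[::-1]  # min(p, q) values, in descending order


def population_cc_squares(A, B):
    """Squared population canonical correlations for x + A z and y + B z.

    Uses Cov(x + Az) = I + A A^T, Cov(y + Bz) = I + B B^T and cross-covariance
    A B^T, which follow from independence and unit variances.
    """
    p, q, r = A.shape[0], B.shape[0], A.shape[1]
    Sxx = np.eye(p) + A @ A.T
    Syy = np.eye(q) + B @ B.T
    Sxy = A @ B.T
    M = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Sxy.T)
    ev = np.linalg.eigvals(M).real  # M is similar to a symmetric PSD matrix
    return np.sort(ev)[::-1][:r]


def simulate_scc(A, B, n, rng):
    """Draw one sample of the model (Gaussian noise is an assumption of this sketch)."""
    p, r = A.shape
    q = B.shape[0]
    X = rng.standard_normal((p, n))
    Y = rng.standard_normal((q, n))
    Z = rng.standard_normal((r, n))
    return scc_eigenvalues(X + A @ Z, Y + B @ Z)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, q, r, n = 50, 40, 1, 500
    A = np.zeros((p, r)); A[0, 0] = 3.0  # rank-one correlation, illustrative strength
    B = np.zeros((q, r)); B[0, 0] = 3.0
    print("population t_1:", population_cc_squares(A, B)[0])
    print("largest SCC eigenvalue:", simulate_scc(A, B, n, rng)[0])
```

Repeating `simulate_scc` over many independent draws and plotting a histogram of the largest eigenvalue gives an informal view of the fluctuations the CLT describes. The centering and variance are given in the paper and are not reproduced here.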
Funding Statement
Partially supported by the Wharton Dean’s Fund for Postdoctoral Research.
Acknowledgments
I want to thank Zongming Ma for bringing this problem to my attention and for valuable suggestions. I also want to thank Edgar Dobriban, David Hong and Yue Sheng for fruitful discussions. I am grateful to the editor, the associate editor and an anonymous referee for their helpful comments, which have resulted in a significant improvement.
Citation
Fan Yang. "Limiting distribution of the sample canonical correlation coefficients of high-dimensional random vectors." Electron. J. Probab. 27, 1-71, 2022. https://doi.org/10.1214/22-EJP814
Information