Open Access
2022 Limiting distribution of the sample canonical correlation coefficients of high-dimensional random vectors
Fan Yang
Electron. J. Probab. 27: 1-71 (2022). DOI: 10.1214/22-EJP814

Abstract

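In this paper, we prove a CLT for the sample canonical correlation coefficients between two high-dimensional random vectors with finite rank correlations. More precisely, consider two random vectors $\widetilde{\mathbf x}=\mathbf x+A\mathbf z$ and $\widetilde{\mathbf y}=\mathbf y+B\mathbf z$, where $\mathbf x\in\mathbb R^{p}$, $\mathbf y\in\mathbb R^{q}$ and $\mathbf z\in\mathbb R^{r}$ are independent random vectors with i.i.d. entries of mean zero and variance one, and $A\in\mathbb R^{p\times r}$ and $B\in\mathbb R^{q\times r}$ are two arbitrary deterministic matrices. Given $n$ samples of $\widetilde{\mathbf x}$ and $\widetilde{\mathbf y}$, we stack them into two matrices $\mathcal X=X+AZ$ and $\mathcal Y=Y+BZ$, where $X\in\mathbb R^{p\times n}$, $Y\in\mathbb R^{q\times n}$ and $Z\in\mathbb R^{r\times n}$ are random matrices with i.i.d. entries of mean zero and variance one. Let $\widetilde\lambda_1\ge\widetilde\lambda_2\ge\cdots\ge\widetilde\lambda_r$ be the largest $r$ eigenvalues of the sample canonical correlation (SCC) matrix $\mathcal C_{\mathcal X\mathcal Y}=(\mathcal X\mathcal X^{\top})^{-1/2}\mathcal X\mathcal Y^{\top}(\mathcal Y\mathcal Y^{\top})^{-1}\mathcal Y\mathcal X^{\top}(\mathcal X\mathcal X^{\top})^{-1/2}$, and let $t_1\ge t_2\ge\cdots\ge t_r$ be the squares of the population canonical correlation coefficients between $\widetilde{\mathbf x}$ and $\widetilde{\mathbf y}$. Under certain moment assumptions, we show that there exists a threshold $t_c\in(0,1)$ such that if $t_i>t_c$, then $\sqrt{n}(\widetilde\lambda_i-\theta_i)$ converges weakly to a centered normal distribution, where $\theta_i$ is a fixed outlier location determined by $t_i$. Our proof uses a self-adjoint linearization of the SCC matrix and a sharp local law on the inverse of the linearized matrix.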
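The setup in the abstract can be explored numerically. The following is a minimal sketch, not the paper's method: it draws the signal-plus-noise model with Gaussian entries, builds the SCC matrix, and returns its eigenvalues. The function names `scc_eigenvalues` and `simulate`, the Gaussian distribution, the dimensions, and the particular choice of A and B (scaled identity blocks) are all illustrative assumptions.

```python
import numpy as np


def scc_eigenvalues(Xc, Yc):
    """Eigenvalues, in descending order, of the SCC matrix
    (Xc Xc^T)^{-1/2} Xc Yc^T (Yc Yc^T)^{-1} Yc Xc^T (Xc Xc^T)^{-1/2}.

    Xc has shape (p, n) and Yc has shape (q, n); both Gram matrices
    are assumed to be invertible (e.g. p, q < n with generic data).
    """
    Sxx = Xc @ Xc.T
    Syy = Yc @ Yc.T
    Sxy = Xc @ Yc.T
    # Symmetric inverse square root of Sxx via its eigendecomposition.
    w, V = np.linalg.eigh(Sxx)
    Sxx_inv_half = (V / np.sqrt(w)) @ V.T
    M = Sxx_inv_half @ Sxy @ np.linalg.solve(Syy, Sxy.T) @ Sxx_inv_half
    M = (M + M.T) / 2  # remove floating-point asymmetry
    return np.linalg.eigvalsh(M)[::-1]


def simulate(n=2000, p=100, q=150, r=2, a=3.0, b=3.0, seed=0):
    """Draw one sample of the model and return the SCC eigenvalues.

    Assumption for illustration only: standard Gaussian entries, and
    A, B equal to a*I_r and b*I_r padded with zeros.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, n))
    Y = rng.standard_normal((q, n))
    Z = rng.standard_normal((r, n))
    A = np.zeros((p, r))
    A[:r, :r] = a * np.eye(r)
    B = np.zeros((q, r))
    B[:r, :r] = b * np.eye(r)
    return scc_eigenvalues(X + A @ Z, Y + B @ Z)


if __name__ == "__main__":
    lam = simulate()
    print("largest five SCC eigenvalues:", np.round(lam[:5], 4))
```

With a strong shared signal such as the one assumed here, the top r eigenvalues should sit visibly apart from the remaining ones; repeating `simulate` over many seeds and plotting a histogram of the largest eigenvalue is one way to look at the fluctuations that the CLT describes.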

Funding Statement

Partially supported by the Wharton Dean’s Fund for Postdoctoral Research.

Acknowledgments

I want to thank Zongming Ma for bringing this problem to my attention and for valuable suggestions. I also want to thank Edgar Dobriban, David Hong and Yue Sheng for fruitful discussions. I am grateful to the editor, the associate editor and an anonymous referee for their helpful comments, which have resulted in a significant improvement.

Citation

Fan Yang. "Limiting distribution of the sample canonical correlation coefficients of high-dimensional random vectors." Electron. J. Probab. 27 1 - 71, 2022. https://doi.org/10.1214/22-EJP814

Information

Received: 1 September 2021; Accepted: 24 June 2022; Published: 2022
First available in Project Euclid: 21 July 2022

MathSciNet: MR4455877
zbMATH: 1498.60102
Digital Object Identifier: 10.1214/22-EJP814

Subjects:
Primary: 60B20, 62E20, 62H99

Keywords: BBP transition, canonical correlation analysis, CLT, spiked eigenvalues

Vol.27 • 2022