## The Annals of Statistics

### Normal approximation and concentration of spectral projectors of sample covariance

#### Abstract

Let $X,X_{1},\dots,X_{n}$ be i.i.d. Gaussian random variables in a separable Hilbert space $\mathbb{H}$ with zero mean and covariance operator $\Sigma=\mathbb{E}(X\otimes X)$, and let $\hat{\Sigma}:=n^{-1}\sum_{j=1}^{n}(X_{j}\otimes X_{j})$ be the sample (empirical) covariance operator based on $(X_{1},\dots,X_{n})$. Denote by $P_{r}$ the spectral projector of $\Sigma$ corresponding to its $r$th eigenvalue $\mu_{r}$ and by $\hat{P}_{r}$ the empirical counterpart of $P_{r}$. The main goal of the paper is to obtain tight bounds on

$\sup_{x\in\mathbb{R}}\vert\mathbb{P} \{\frac{\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2}-\mathbb{E}\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2}}{\operatorname{Var}^{1/2}(\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2})}\leq x\}-\Phi (x)\vert ,$ where $\Vert \cdot \Vert_{2}$ denotes the Hilbert–Schmidt norm and $\Phi$ is the standard normal distribution function. The accuracy of this normal approximation of the distribution of the squared Hilbert–Schmidt error is characterized in terms of the so-called effective rank of $\Sigma$, defined as ${\mathbf{r}}(\Sigma)=\frac{\operatorname{tr}(\Sigma)}{\Vert \Sigma \Vert_{\infty}}$, where $\operatorname{tr}(\Sigma)$ is the trace of $\Sigma$ and $\Vert \Sigma \Vert_{\infty}$ is its operator norm, as well as another parameter characterizing the size of $\operatorname{Var}(\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2})$. Other results include nonasymptotic bounds and asymptotic representations for the mean squared Hilbert–Schmidt norm error $\mathbb{E}\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2}$ and the variance $\operatorname{Var}(\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2})$, and concentration inequalities for $\Vert \hat{P}_{r}-P_{r}\Vert_{2}^{2}$ around its expectation.
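
The quantities in the abstract can be made concrete in finite dimension. The following is a minimal sketch, not the authors' method: it assumes $\mathbb{H}=\mathbb{R}^{4}$, a toy diagonal $\Sigma$, and a simple (multiplicity one) eigenvalue $\mu_{r}$, so that $P_{r}$ has rank one. The function names are hypothetical, and the standardization uses Monte Carlo estimates of the mean and standard deviation in place of the exact $\mathbb{E}$ and $\operatorname{Var}$. It illustrates the definitions only and is not a check of the paper's bounds.

```python
import numpy as np


def effective_rank(sigma):
    """Effective rank r(Sigma) = tr(Sigma) / ||Sigma||_inf (operator norm)."""
    return np.trace(sigma) / np.linalg.norm(sigma, 2)


def spectral_projector(sigma, r):
    """Projector onto the eigenvector of the r-th largest eigenvalue (1-based).

    Assumes that eigenvalue is simple, so the projector has rank one.
    """
    _, vecs = np.linalg.eigh(sigma)  # eigenvalues come back in ascending order
    v = vecs[:, -r]
    return np.outer(v, v)


def squared_hs_error(sigma, n, r, rng):
    """One draw of ||P_r_hat - P_r||_2^2 from n centered Gaussian observations."""
    dim = sigma.shape[0]
    x = rng.multivariate_normal(np.zeros(dim), sigma, size=n)
    sigma_hat = x.T @ x / n  # sample covariance; no centering, the mean is zero
    diff = spectral_projector(sigma_hat, r) - spectral_projector(sigma, r)
    return float(np.sum(diff ** 2))  # squared Frobenius norm = squared HS norm


def standardized_errors(sigma, n, r, reps, seed=0):
    """Monte Carlo draws of the error, centered and scaled by sample mean and std."""
    rng = np.random.default_rng(seed)
    errs = np.array([squared_hs_error(sigma, n, r, rng) for _ in range(reps)])
    return (errs - errs.mean()) / errs.std()


# Toy covariance chosen for illustration only (an assumption, not from the paper).
sigma = np.diag([4.0, 2.0, 1.0, 1.0])
z = standardized_errors(sigma, n=500, r=1, reps=200)
print(effective_rank(sigma), z.mean(), z.std())
```

Comparing the empirical distribution of `z` with $\Phi$ (for example through a Kolmogorov distance) would mimic the supremum in the display above, under the same assumptions.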

#### Article information

Source
Ann. Statist., Volume 45, Number 1 (2017), 121-157.

Dates
Revised: January 2016
First available in Project Euclid: 21 February 2017

Permanent link to this document
https://projecteuclid.org/euclid.aos/1487667619

Digital Object Identifier
doi:10.1214/16-AOS1437

Mathematical Reviews number (MathSciNet)
MR3611488

Zentralblatt MATH identifier
1367.62175

Subjects
Primary: 62H12: Estimation

#### Citation

Koltchinskii, Vladimir; Lounici, Karim. Normal approximation and concentration of spectral projectors of sample covariance. Ann. Statist. 45 (2017), no. 1, 121–157. doi:10.1214/16-AOS1437. https://projecteuclid.org/euclid.aos/1487667619
