Annales de l'Institut Henri Poincaré, Probabilités et Statistiques

Asymptotics and concentration bounds for bilinear forms of spectral projectors of sample covariance

Vladimir Koltchinskii and Karim Lounici

Abstract

Let $X,X_{1},\ldots,X_{n}$ be i.i.d. Gaussian random variables in a separable Hilbert space $\mathbb{H}$ with zero mean and covariance operator $\Sigma=\mathbb{E}(X\otimes X)$. Let

\[\mathbf{r}(\Sigma):=\frac{\operatorname{tr}(\Sigma)}{\|\Sigma\|_{\infty}}\] be the effective rank of $\Sigma$, $\operatorname{tr}(\Sigma)$ being the trace of $\Sigma $ and $\|\Sigma\|_{\infty}$ being its operator norm. Let

\[\hat{\Sigma}_{n}:=n^{-1}\sum_{j=1}^{n}(X_{j}\otimes X_{j})\] be the sample (empirical) covariance operator based on $(X_{1},\ldots,X_{n})$. The paper deals with a problem of estimation of spectral projectors of the covariance operator $\Sigma $ by their empirical counterparts, the spectral projectors of $\hat{\Sigma}_{n}$ (empirical spectral projectors). The focus is on the problems where both the sample size $n$ and the effective rank $\mathbf{r}(\Sigma)$ are large. This framework includes and generalizes well known high-dimensional spiked covariance models. Given a spectral projector $P_{r}$ corresponding to an eigenvalue $\mu_{r}$ of covariance operator $\Sigma $ and its empirical counterpart $\hat{P}_{r}$, we derive sharp concentration bounds for bilinear forms of empirical spectral projector $\hat{P}_{r}$ in terms of sample size $n$ and effective dimension $\mathbf{r}(\Sigma)$. Building upon these concentration bounds, we prove the asymptotic normality of bilinear forms of random operators $\hat{P}_{r}-\mathbb{E}\hat{P}_{r}$ under the assumptions that $n\to\infty $ and $\mathbf{r}(\Sigma)=o(n)$. In a special case of eigenvalues of multiplicity one, these results are rephrased as concentration bounds and asymptotic normality for linear forms of empirical eigenvectors. Other results include bounds on the bias $\mathbb{E}\hat{P}_{r}-P_{r}$ and a method of bias reduction as well as a discussion of possible applications to statistical inference in high-dimensional Principal Component Analysis.
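The definitions above can be illustrated in a small finite-dimensional setting. The following is a minimal sketch, not the authors' method: it assumes a hypothetical spiked model in $\mathbb{R}^{10}$ with a diagonal covariance (one eigenvalue equal to 10, the rest equal to 1), computes the effective rank $\mathbf{r}(\Sigma)$, forms the sample covariance $\hat{\Sigma}_{n}$, and evaluates a bilinear form $\langle\hat{P}_{1}u,v\rangle$ of the empirical spectral projector for the top eigenvalue. Because that eigenvalue has multiplicity one, the projector is rank one and the bilinear form reduces to $\langle\hat{\theta}_{1},u\rangle\langle\hat{\theta}_{1},v\rangle$, where $\hat{\theta}_{1}$ is the top empirical eigenvector. Plain power iteration is used here only to keep the example dependency-free; all function names, the dimension, the eigenvalues and the sample size are illustrative assumptions.

```python
import math
import random


def dot(x, y):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))


def effective_rank(eigenvalues):
    """r(Sigma) = tr(Sigma) / ||Sigma||_inf, computed from the eigenvalues of Sigma."""
    return sum(eigenvalues) / max(eigenvalues)


def sample_covariance(samples):
    """Sigma_hat = n^{-1} sum_j X_j (x) X_j; the data are assumed centered."""
    n, d = len(samples), len(samples[0])
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        for a in range(d):
            for b in range(d):
                cov[a][b] += x[a] * x[b] / n
    return cov


def top_eigenvector(matrix, iterations=500):
    """Power iteration; assumes the largest eigenvalue is simple and well separated."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        v = [c / norm for c in w]
    return v


def bilinear_form(theta, u, v):
    """<P u, v> for the rank-one projector P = theta (x) theta."""
    return dot(theta, u) * dot(theta, v)


# Hypothetical spiked model: Sigma = diag(10, 1, ..., 1) in dimension 10.
rng = random.Random(0)
eigs = [10.0] + [1.0] * 9
n = 2000
samples = [[math.sqrt(lam) * rng.gauss(0.0, 1.0) for lam in eigs] for _ in range(n)]

theta_hat = top_eigenvector(sample_covariance(samples))
e1 = [1.0] + [0.0] * 9  # true top eigenvector of Sigma in this model

print("effective rank:", effective_rank(eigs))
print("<P_hat e1, e1>:", bilinear_form(theta_hat, e1, e1))
```

In this toy setting the population value is $\langle P_{1}e_{1},e_{1}\rangle=1$, so the gap between the printed bilinear form and 1 gives a rough feel for the estimation error studied in the paper when $\mathbf{r}(\Sigma)$ is small relative to $n$.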

Résumé (English translation)

Let $X,X_{1},\ldots,X_{n}$ be i.i.d. centered Gaussian vectors taking values in a separable Hilbert space $\mathbb{H}$. We define the covariance operator $\Sigma=\mathbb{E}(X\otimes X)$ and the effective rank of $\Sigma$

\[\mathbf{r}(\Sigma):=\frac{\operatorname{tr}(\Sigma)}{\|\Sigma\|_{\infty}}\] where $\operatorname{tr}(\Sigma)$ is the trace of $\Sigma$ and $\|\Sigma\|_{\infty}$ is its operator norm. We consider

\[\hat{\Sigma}_{n}:=n^{-1}\sum_{j=1}^{n}(X_{j}\otimes X_{j})\] the empirical covariance operator built from the observations $(X_{1},\ldots,X_{n})$. This paper considers the problem of estimating the spectral projectors of the covariance operator $\Sigma$ by the empirical spectral projectors, that is, the spectral projectors of $\hat{\Sigma}_{n}$. We focus on problems where both the number of observations $n$ and the effective rank $\mathbf{r}(\Sigma)$ are large. This framework includes and generalizes high-dimensional spiked covariance models. Let $P_{r}$ be a spectral projector corresponding to an eigenvalue $\mu_{r}$ of the covariance operator $\Sigma$ and let $\hat{P}_{r}$ be its empirical version. We establish sharp concentration bounds for bilinear forms of the empirical projector $\hat{P}_{r}$ that depend on the number of observations $n$ and the effective dimension $\mathbf{r}(\Sigma)$. We then use these concentration bounds to establish the asymptotic normality of bilinear forms of the random operators $\hat{P}_{r}-\mathbb{E}\hat{P}_{r}$ under the assumptions that $n\to\infty$ and $\mathbf{r}(\Sigma)=o(n)$. In the special case of eigenvalues of multiplicity $1$, these results are restated as concentration bounds and asymptotic normality for linear forms of the empirical eigenvectors. We also prove new results on the bias $\mathbb{E}\hat{P}_{r}-P_{r}$, including a method of bias reduction. Finally, we discuss possible applications of these results to high-dimensional statistical inference for principal component analysis.

Article information

Source
Ann. Inst. H. Poincaré Probab. Statist. Volume 52, Number 4 (2016), 1976-2013.

Dates
Received: 29 August 2014
Revised: 5 July 2015
Accepted: 31 July 2015
First available in Project Euclid: 17 November 2016

Permanent link to this document
https://projecteuclid.org/euclid.aihp/1479373255

Digital Object Identifier
doi:10.1214/15-AIHP705

Mathematical Reviews number (MathSciNet)
MR3573302

Zentralblatt MATH identifier
1353.62053

Subjects
Primary: 62H12: Estimation

Keywords
Sample covariance; Spectral projectors; Effective rank; Principal Component Analysis; Concentration inequalities; Asymptotic distribution; Perturbation theory

Citation

Koltchinskii, Vladimir; Lounici, Karim. Asymptotics and concentration bounds for bilinear forms of spectral projectors of sample covariance. Ann. Inst. H. Poincaré Probab. Statist. 52 (2016), no. 4, 1976–2013. doi:10.1214/15-AIHP705. https://projecteuclid.org/euclid.aihp/1479373255.


