## Annals of Probability

### Covariance estimation for distributions with ${2+\varepsilon}$ moments

#### Abstract

We study the minimal sample size $N=N(n)$ that suffices to estimate the covariance matrix of an $n$-dimensional distribution by the sample covariance matrix in the operator norm, with an arbitrary fixed accuracy. We establish the optimal bound $N=O(n)$ for every distribution whose $k$-dimensional marginals have uniformly bounded $2+\varepsilon$ moments outside the sphere of radius $O(\sqrt{k})$. In the specific case of log-concave distributions, this result provides an alternative approach to the Kannan–Lovász–Simonovits problem, which was recently solved by Adamczak et al. [J. Amer. Math. Soc. 23 (2010) 535–561]. Moreover, a lower estimate on the covariance matrix holds under a weaker assumption: uniformly bounded $2+\varepsilon$ moments of one-dimensional marginals. Our argument consists of randomizing the spectral sparsifier, a deterministic tool developed recently by Batson, Spielman and Srivastava [SIAM J. Comput. 41 (2012) 1704–1721]. The new randomized method allows one to control the spectral edges of the sample covariance matrix via the Stieltjes transform evaluated at carefully chosen random points.
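The quantity studied in the abstract, the operator-norm distance between the sample covariance matrix and the true covariance matrix, can be explored numerically. The following is a minimal sketch, not the authors' method: the function name `sample_covariance_error`, the choice of independent Student-t coordinates, and the parameter `df` are illustrative assumptions, and this test distribution is not claimed to match the paper's exact moment hypothesis. The sketch only shows how one might observe the error for sample sizes $N$ proportional to the dimension $n$.

```python
import numpy as np


def sample_covariance_error(n, N, df=5.0, seed=0):
    """Return the operator-norm error of the sample covariance matrix.

    Assumption (illustrative only): coordinates are i.i.d. Student-t with
    `df` degrees of freedom, rescaled to unit variance, so the true
    covariance matrix is the n x n identity.
    """
    rng = np.random.default_rng(seed)
    # A Student-t variable with df > 2 has variance df / (df - 2).
    scale = np.sqrt(df / (df - 2.0))
    X = rng.standard_t(df, size=(N, n)) / scale  # N samples in dimension n
    # Sample covariance for a distribution known to have mean zero.
    sigma_hat = X.T @ X / N
    # ord=2 on a 2-D array gives the largest singular value (operator norm).
    return float(np.linalg.norm(sigma_hat - np.eye(n), ord=2))


if __name__ == "__main__":
    # Compare the error for N = c * n over several dimensions n.
    for n in (20, 40, 80):
        errs = [sample_covariance_error(n, c * n, seed=n) for c in (2, 20, 200)]
        print(n, [round(e, 3) for e in errs])
```

If the bound $N=O(n)$ applies to the chosen distribution, one would expect the error for a fixed ratio $N/n$ to stay roughly comparable as $n$ grows, and to shrink as the ratio increases. A single simulation run is only suggestive and does not verify the theorem.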

#### Article information

Source
Ann. Probab., Volume 41, Number 5 (2013), 3081–3111.

Dates
First available in Project Euclid: 12 September 2013

Permanent link to this document
https://projecteuclid.org/euclid.aop/1378991832

Digital Object Identifier
doi:10.1214/12-AOP760

Mathematical Reviews number (MathSciNet)
MR3127875

Zentralblatt MATH identifier
1293.62121

Subjects
Primary: 62H12: Estimation
Secondary: 60B20: Random matrices (probabilistic aspects; for algebraic aspects see 15B52)

#### Citation

Srivastava, Nikhil; Vershynin, Roman. Covariance estimation for distributions with ${2+\varepsilon}$ moments. Ann. Probab. 41 (2013), no. 5, 3081–3111. doi:10.1214/12-AOP760. https://projecteuclid.org/euclid.aop/1378991832

#### References

• [1] Adamczak, R., Litvak, A. E., Pajor, A. and Tomczak-Jaegermann, N. (2010). Quantitative estimates of the convergence of the empirical covariance matrix in log-concave ensembles. J. Amer. Math. Soc. 23 535–561.
• [2] Adamczak, R., Litvak, A. E., Pajor, A. and Tomczak-Jaegermann, N. (2011). Sharp bounds on the rate of convergence of the empirical covariance matrix. C. R. Math. Acad. Sci. Paris 349 195–200.
• [3] Bai, Z. D., Silverstein, J. W. and Yin, Y. Q. (1988). A note on the largest eigenvalue of a large-dimensional sample covariance matrix. J. Multivariate Anal. 26 166–168.
• [4] Bai, Z. D. and Yin, Y. Q. (1993). Limit of the smallest eigenvalue of a large-dimensional sample covariance matrix. Ann. Probab. 21 1275–1294.
• [5] Batson, J. D., Spielman, D. A. and Srivastava, N. (2012). Twice-Ramanujan sparsifiers. SIAM J. Comput. 41 1704–1721.
• [6] Benaych-Georges, F. and Nadakuditi, R. R. (2011). The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Adv. Math. 227 494–521.
• [7] de la Peña, V. H. and Giné, E. (1999). Decoupling: From Dependence to Independence. Randomly Stopped Processes. $U$-Statistics and Processes. Martingales and Beyond. Springer, New York.
• [8] Figiel, T., Hitczenko, P., Johnson, W. B., Schechtman, G. and Zinn, J. (1997). Extremal properties of Rademacher functions with applications to the Khintchine and Rosenthal inequalities. Trans. Amer. Math. Soc. 349 997–1027.
• [9] Kannan, R., Lovász, L. and Simonovits, M. (1997). Random walks and an $O^{*}(n^{5})$ volume algorithm for convex bodies. Random Structures Algorithms 11 1–50.
• [10] Latała, R. (2005). Some estimates of norms of random matrices. Proc. Amer. Math. Soc. 133 1273–1282 (electronic).
• [11] Paouris, G. (2006). Concentration of mass on convex bodies. Geom. Funct. Anal. 16 1021–1049.
• [12] Rosenthal, H. P. (1970). On the subspaces of $L^{p}$ ($p>2$) spanned by sequences of independent random variables. Israel J. Math. 8 273–303.
• [13] Rudelson, M. (1999). Random vectors in the isotropic position. J. Funct. Anal. 164 60–72.
• [14] Srivastava, N. (2010). Spectral sparsification and restricted invertibility. Ph.D. thesis, Yale Univ.
• [15] Vershynin, R. (2011). A simple decoupling inequality in probability theory. Available at http://www-personal.umich.edu/~romanv/papers/decoupling-simple.pdf.
• [16] Vershynin, R. (2012). How close is the sample covariance matrix to the actual covariance matrix? J. Theoret. Probab. 25 655–686.
• [17] Vershynin, R. (2012). Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing: Theory and Applications (Y. Eldar and G. Kutyniok, eds.) 210–268. Cambridge Univ. Press, Cambridge.