Annals of Probability

Covariance estimation for distributions with ${2+\varepsilon}$ moments

Nikhil Srivastava and Roman Vershynin

We study the minimal sample size $N=N(n)$ that suffices to estimate the covariance matrix of an $n$-dimensional distribution by the sample covariance matrix in the operator norm, with an arbitrary fixed accuracy. We establish the optimal bound $N=O(n)$ for every distribution whose $k$-dimensional marginals have uniformly bounded $2+\varepsilon$ moments outside the sphere of radius $O(\sqrt{k})$. In the specific case of log-concave distributions, this result provides an alternative approach to the Kannan–Lovász–Simonovits problem, which was recently solved by Adamczak et al. [J. Amer. Math. Soc. 23 (2010) 535–561]. Moreover, a lower estimate on the covariance matrix holds under a weaker assumption, namely uniformly bounded $2+\varepsilon$ moments of one-dimensional marginals. Our argument consists of randomizing the spectral sparsifier, a deterministic tool developed recently by Batson, Spielman and Srivastava [SIAM J. Comput. 41 (2012) 1704–1721]. The new randomized method allows one to control the spectral edges of the sample covariance matrix via the Stieltjes transform evaluated at carefully chosen random points.
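As a rough numerical illustration of the quantity discussed above (not code from the paper), the following sketch draws $N = Cn$ samples, forms the sample covariance matrix, and measures its operator-norm distance to the true covariance. Everything specific here is an assumption made for illustration: the function names are hypothetical, the distribution is the standard Gaussian (a log-concave distribution with identity covariance), and the operator norm is only approximated by power iteration.

```python
import math
import random


def sample_covariance(samples):
    """Return (1/N) * sum_i x_i x_i^T for mean-zero samples given as lists."""
    n = len(samples[0])
    cov = [[0.0] * n for _ in range(n)]
    for x in samples:
        for i in range(n):
            xi = x[i]
            row = cov[i]
            for j in range(n):
                row[j] += xi * x[j]
    count = len(samples)
    return [[entry / count for entry in row] for row in cov]


def operator_norm(sym, iters=200, seed=0):
    """Approximate the operator norm of a symmetric matrix by power iteration.

    The returned value is a lower bound that approaches the largest absolute
    eigenvalue as the number of iterations grows.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(len(sym))]
    scale = math.sqrt(sum(c * c for c in v))
    v = [c / scale for c in v]
    estimate = 0.0
    for _ in range(iters):
        w = [sum(a * c for a, c in zip(row, v)) for row in sym]
        estimate = math.sqrt(sum(c * c for c in w))
        if estimate == 0.0:
            return 0.0
        v = [c / estimate for c in w]
    return estimate


def covariance_error(n, ratio, seed=0):
    """Operator-norm error of the sample covariance with N = ratio * n samples.

    Assumption: samples are standard Gaussian, so the true covariance is the
    identity matrix.
    """
    rng = random.Random(seed)
    samples = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(ratio * n)]
    cov = sample_covariance(samples)
    diff = [
        [cov[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return operator_norm(diff)


if __name__ == "__main__":
    # Hold the ratio N/n fixed per row; the error should shrink as it grows.
    for ratio in (5, 20, 80):
        print(ratio, covariance_error(n=20, ratio=ratio))
```

In this toy setting the error depends mainly on the ratio $N/n$, which is the sense in which a sample size proportional to the dimension suffices for a fixed accuracy. The Gaussian case does not exercise the heavy-tailed regime that the $2+\varepsilon$ moment assumption is meant to cover.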

Article information

Ann. Probab., Volume 41, Number 5 (2013), 3081-3111.

First available in Project Euclid: 12 September 2013

Primary: 62H12: Estimation
Secondary: 60B20: Random matrices (probabilistic aspects; for algebraic aspects see 15B52)

Keywords: Covariance matrices; high-dimensional distributions; Stieltjes transform; log-concave distributions; random matrices


Srivastava, Nikhil; Vershynin, Roman. Covariance estimation for distributions with ${2+\varepsilon}$ moments. Ann. Probab. 41 (2013), no. 5, 3081–3111. doi:10.1214/12-AOP760.

