## The Annals of Statistics

### Spectrum estimation from samples

#### Abstract

We consider the problem of approximating the set of eigenvalues of the covariance matrix of a multivariate distribution (equivalently, the problem of approximating the “population spectrum”), given access to samples drawn from the distribution. We study this recovery problem in the regime where the sample size is comparable to, or even sublinear in, the dimensionality of the distribution. First, we propose a theoretically optimal and computationally efficient algorithm for recovering the moments of the eigenvalues of the population covariance matrix. We then leverage this accurate moment recovery, via a Wasserstein distance argument, to accurately reconstruct the vector of eigenvalues. Together, this yields an eigenvalue reconstruction algorithm that is asymptotically consistent as the dimensionality of the distribution and the sample size tend to infinity, even in the sublinear regime where the ratio of the sample size to the dimensionality tends to zero. In addition to our theoretical results, we show that our approach performs well in practice for a broad range of distributions and sample sizes.
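As a minimal illustration of the moment-based idea (a sketch only — not the authors' algorithm, which recovers higher moments and handles the sublinear regime), the following Python snippet shows simple unbiased estimators of the first two spectral moments, tr(Σ)/d and tr(Σ²)/d, from zero-mean samples. The distribution, dimensions, and variable names here are hypothetical choices for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 500, 200                        # dimension exceeds sample size
eigs = np.linspace(0.5, 2.0, d)        # population spectrum (eigenvalues of Sigma)
X = rng.standard_normal((n, d)) * np.sqrt(eigs)   # zero-mean rows, cov = diag(eigs)

# First spectral moment tr(Sigma)/d: average squared sample norm, divided by d.
# E[||x||^2] = tr(Sigma), so this is unbiased.
m1 = np.mean(np.sum(X**2, axis=1)) / d

# Second spectral moment tr(Sigma^2)/d: average of (x_i . x_j)^2 over distinct
# pairs i != j.  For independent zero-mean x_i, x_j with covariance Sigma,
# E[(x_i . x_j)^2] = tr(Sigma^2), so this is also unbiased -- note it avoids
# the diagonal terms ||x_i||^4, whose bias dominates in high dimension.
G = X @ X.T                            # Gram matrix of pairwise inner products
off = G[~np.eye(n, dtype=bool)]        # off-diagonal entries x_i . x_j, i != j
m2 = np.mean(off**2) / d

print(m1, np.mean(eigs))               # estimate vs. true tr(Sigma)/d
print(m2, np.mean(eigs**2))            # estimate vs. true tr(Sigma^2)/d
```

Restricting the second estimator to distinct sample pairs is what removes the bias that a naive plug-in of the sample covariance would incur when n is small relative to d.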

#### Article information

**Source:** Ann. Statist., Volume 45, Number 5 (2017), 2218–2247.

**Dates:** Revised: October 2016. First available in Project Euclid: 31 October 2017.

https://projecteuclid.org/euclid.aos/1509436833

**Digital Object Identifier:** doi:10.1214/16-AOS1525

**Mathematical Reviews number (MathSciNet):** MR3718167

**Zentralblatt MATH identifier:** 06821124

**Subjects:** Primary: 62H12 (Estimation); 62H10 (Distribution of statistics)

#### Citation

Kong, Weihao; Valiant, Gregory. Spectrum estimation from samples. Ann. Statist. 45 (2017), no. 5, 2218--2247. doi:10.1214/16-AOS1525. https://projecteuclid.org/euclid.aos/1509436833
