Annals of Applied Probability

Concentration of measure and spectra of random matrices: Applications to correlation matrices, elliptical distributions and beyond

Noureddine El Karoui

Abstract

We place ourselves in the setting of high-dimensional statistical inference, where the number of variables p in a data set of interest is of the same order of magnitude as the number of observations n. More formally, we study the asymptotic properties of correlation and covariance matrices, in the setting where p/n → ρ ∈ (0, ∞), for general population covariance.

We show that, for a large class of models studied in random matrix theory, spectral properties of large-dimensional correlation matrices are similar to those of large-dimensional covariance matrices.
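
As an informal illustration of this statement, the following sketch simulates Gaussian data with identity population covariance and compares the eigenvalues of the sample covariance matrix with those of the sample correlation matrix. The function name compare_spectra, the Gaussian model and the sizes n = 2000, p = 500 are assumptions chosen for the example; they are not taken from the paper, which treats far more general models.

```python
import numpy as np


def compare_spectra(n, p, seed=0):
    """Return sorted eigenvalues of the sample covariance and sample
    correlation matrices of an n x p standard Gaussian data matrix.

    Illustrative assumption: identity population covariance.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))       # rows are observations
    cov = np.cov(X, rowvar=False)         # p x p sample covariance
    corr = np.corrcoef(X, rowvar=False)   # p x p sample correlation
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    return np.linalg.eigvalsh(cov), np.linalg.eigvalsh(corr)


if __name__ == "__main__":
    n, p = 2000, 500
    cov_eigs, corr_eigs = compare_spectra(n, p)
    # Right edge of the Marchenko-Pastur support for identity covariance
    edge = (1 + np.sqrt(p / n)) ** 2
    print("largest covariance eigenvalue: ", cov_eigs.max())
    print("largest correlation eigenvalue:", corr_eigs.max())
    print("Marchenko-Pastur right edge:   ", edge)
    print("mean |difference| of sorted eigenvalues:",
          np.mean(np.abs(cov_eigs - corr_eigs)))
```

In this simple setting the two sorted spectra should nearly coincide, and both largest eigenvalues should sit close to (1 + √(p/n))² = 2.25.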

We also derive a Marčenko–Pastur-type system of equations for the limiting spectral distribution of covariance matrices computed from data with elliptical distributions and generalizations of this family. The motivation for this study comes partly from the possible relevance of such distributional assumptions to problems in econometrics and portfolio optimization, as well as robustness questions for certain classical random matrix results.
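
For orientation, the classical Marčenko–Pastur equation can be written as follows. This is the standard result for data of the form X = ZΣ^{1/2}, where Z is n × p with i.i.d. mean 0, variance 1 entries. It is not the elliptical system derived in the paper. The notation is assumed here: v is the Stieltjes transform of the limiting spectral distribution of the n × n matrix XX*/n, H is the limiting spectral distribution of the population covariance, and ρ is the limit of p/n.

```latex
% Classical Marchenko--Pastur equation (notation assumed for this sketch):
% v(z): Stieltjes transform of the limiting spectral distribution of XX^*/n
% H:    limiting spectral distribution of the population covariance
% rho:  limit of p/n
\[
  -\frac{1}{v(z)} \;=\; z \;-\; \rho \int \frac{\lambda \, dH(\lambda)}{1 + \lambda\, v(z)},
  \qquad z \in \mathbb{C}^{+}.
\]
```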

A mathematical theme of the paper is the important use we make of concentration inequalities.

Article information

Source
Ann. Appl. Probab., Volume 19, Number 6 (2009), 2362-2405.

Dates
First available in Project Euclid: 25 November 2009

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1259158775

Digital Object Identifier
doi:10.1214/08-AAP548

Mathematical Reviews number (MathSciNet)
MR2588248

Zentralblatt MATH identifier
1255.62156

Subjects
Primary: 62H10: Distribution of statistics

Keywords
Covariance matrices; correlation matrices; eigenvalues of covariance matrices; multivariate statistical analysis; high-dimensional inference; random matrix theory; elliptical distributions; concentration of measure

Citation

El Karoui, Noureddine. Concentration of measure and spectra of random matrices: Applications to correlation matrices, elliptical distributions and beyond. Ann. Appl. Probab. 19 (2009), no. 6, 2362–2405. doi:10.1214/08-AAP548. https://projecteuclid.org/euclid.aoap/1259158775

