Annals of Statistics

The spectrum of kernel random matrices

Noureddine El Karoui

Abstract

We place ourselves in the setting of high-dimensional statistical inference where the number of variables p in a dataset of interest is of the same order of magnitude as the number of observations n.

We consider the spectrum of certain kernel random matrices, in particular n×n matrices whose (i, j)th entry is $f(X_i' X_j/p)$ or $f(\|X_i - X_j\|^2/p)$, where p is the dimension of the data and the $X_i$ are independent data vectors. Here f is assumed to be a locally smooth function.
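To make the two constructions concrete, here is a minimal NumPy sketch that builds both kernel matrices from a data matrix whose rows are the observations. The helper name `kernel_matrices`, the Gaussian data, and the choice f(t) = exp(-t) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def kernel_matrices(X, f):
    """Return the inner-product and squared-distance kernel matrices.

    X is an (n, p) array whose rows are the data vectors X_i;
    f is applied entrywise (hypothetical helper, for illustration only).
    """
    n, p = X.shape
    gram = X @ X.T / p                    # entries X_i' X_j / p
    sq_norms = np.diag(gram)              # entries ||X_i||^2 / p
    # ||X_i - X_j||^2 / p = ||X_i||^2/p + ||X_j||^2/p - 2 X_i' X_j / p
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negative rounding errors
    return f(gram), f(sq_dists)

rng = np.random.default_rng(0)
n, p = 50, 40                             # n and p of the same order
X = rng.standard_normal((n, p))           # assumed model: i.i.d. Gaussian entries
K_inner, K_dist = kernel_matrices(X, lambda t: np.exp(-t))

# Both matrices are symmetric, so eigvalsh returns their real spectra.
eigs_inner = np.linalg.eigvalsh(K_inner)
eigs_dist = np.linalg.eigvalsh(K_dist)
```

Both matrices are n×n regardless of p; only the scaling by p inside f depends on the dimension.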

The study is motivated by questions arising in statistics and computer science where these matrices are used to perform, among other things, nonlinear versions of principal component analysis. Surprisingly, we show that in high dimensions, and for the models we analyze, the problem becomes essentially linear, which is at odds with heuristics sometimes used to justify the use of these methods. The analysis also highlights certain peculiarities of models widely studied in random matrix theory and raises some questions about their relevance as tools to model high-dimensional data encountered in practice.
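The linearization claim can be checked numerically in a simple case. The sketch below compares an inner-product kernel matrix with a surrogate built from a first-order Taylor expansion of f around 0, with the diagonal corrected separately. The surrogate's exact form, the identity-covariance Gaussian model, and the choice f = exp are assumptions of this sketch and should not be read as the paper's theorem statement.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 200                      # n and p of the same order
X = rng.standard_normal((n, p))      # assumed model: identity covariance

f = np.exp                           # illustrative smooth kernel function
f0, fprime0 = 1.0, 1.0               # f(0) and f'(0) for f = exp
tau = 1.0                            # trace(Sigma)/p under identity covariance

gram = X @ X.T / p
K = f(gram)                          # kernel matrix with entries f(X_i' X_j / p)

# Assumed linear surrogate: constant rank-one part, plus a multiple of the
# Gram matrix, plus a multiple of the identity that corrects the diagonal.
K_lin = (
    f0 * np.ones((n, n))
    + fprime0 * gram
    + (f(tau) - f0 - tau * fprime0) * np.eye(n)
)

# Relative error in operator (spectral) norm.
rel_err = np.linalg.norm(K - K_lin, 2) / np.linalg.norm(K, 2)
print(f"relative operator-norm error: {rel_err:.4f}")
```

Off the diagonal the scaled inner products are close to 0, so a low-order expansion of f is accurate there; the diagonal entries sit near tau instead, which is why they are handled by a separate identity term.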

Article information

Source
Ann. Statist., Volume 38, Number 1 (2010), 1-50.

Dates
First available in Project Euclid: 31 December 2009

Permanent link to this document
https://projecteuclid.org/euclid.aos/1262271608

Digital Object Identifier
doi:10.1214/08-AOS648

Mathematical Reviews number (MathSciNet)
MR2589315

Zentralblatt MATH identifier
1181.62078

Subjects
Primary: 62H10: Distribution of statistics
Secondary: 60F99: None of the above, but in this section

Keywords
Covariance matrices; kernel matrices; eigenvalues of covariance matrices; multivariate statistical analysis; high-dimensional inference; random matrix theory; machine learning; Hadamard matrix functions; concentration of measure

Citation

El Karoui, Noureddine. The spectrum of kernel random matrices. Ann. Statist. 38 (2010), no. 1, 1–50. doi:10.1214/08-AOS648. https://projecteuclid.org/euclid.aos/1262271608

