The spectrum of kernel random matrices

Noureddine El Karoui

doi:10.1214/08-AOS648

Ann. Statist. 38(1): 1-50 (February 2010). DOI: 10.1214/08-AOS648

Abstract

We place ourselves in the setting of high-dimensional statistical inference where the number of variables p in a dataset of interest is of the same order of magnitude as the number of observations n.

We consider the spectrum of certain kernel random matrices, in particular n×n matrices whose (i, j)th entry is f(X_i'X_j/p) or f(‖X_i−X_j‖²/p), where p is the dimension of the data and the X_i are independent data vectors. Here f is assumed to be a locally smooth function.

The study is motivated by questions arising in statistics and computer science, where these matrices are used to perform, among other things, nonlinear versions of principal component analysis. Surprisingly, we show that in high dimensions, and for the models we analyze, the problem becomes essentially linear, which is at odds with heuristics sometimes used to justify the use of these methods. The analysis also highlights certain peculiarities of models widely studied in random matrix theory and raises some questions about their relevance as tools to model high-dimensional data encountered in practice.
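As an illustration of the two kernel matrices described in the abstract, the following minimal sketch builds both from simulated data. The Gaussian data model, the sizes n = 6 and p = 8, the choice f(t) = exp(−t), and the helper name `kernel_matrices` are all assumptions made for the example and are not taken from the paper.

```python
import math
import random


def kernel_matrices(X, f):
    """Build the inner-product and squared-distance kernel matrices.

    X is a list of n data vectors, each of length p. The (i, j) entries are
    f(X_i'X_j / p) and f(||X_i - X_j||^2 / p) respectively.
    """
    n, p = len(X), len(X[0])
    inner = [[0.0] * n for _ in range(n)]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(X[i], X[j]))
            sq = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            inner[i][j] = f(dot / p)
            dist[i][j] = f(sq / p)
    return inner, dist


# Hypothetical setup: n and p of the same order, i.i.d. standard Gaussian entries.
rng = random.Random(0)
n, p = 6, 8
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]

# f(t) = exp(-t) is only an illustrative smooth function.
K_inner, K_dist = kernel_matrices(X, lambda t: math.exp(-t))
```

Both matrices are n×n and symmetric. With this choice of f, the diagonal of `K_dist` equals f(0) = 1 because each vector is at distance zero from itself. The eigenvalues of either matrix could then be computed with any symmetric eigensolver.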

Citation

Noureddine El Karoui. "The spectrum of kernel random matrices." Ann. Statist. 38 (1) 1 - 50, February 2010. https://doi.org/10.1214/08-AOS648

Information

Published: February 2010

First available in Project Euclid: 31 December 2009

zbMATH: 1181.62078

MathSciNet: MR2589315

Digital Object Identifier: 10.1214/08-AOS648

Subjects:

Primary: 62H10

Secondary: 60F99

Keywords: concentration of measure, covariance matrices, eigenvalues of covariance matrices, Hadamard matrix functions, high-dimensional inference, kernel matrices, machine learning, multivariate statistical analysis, random matrix theory
