## Electronic Journal of Statistics

### Kernel spectral clustering of large dimensional data

#### Abstract

This article proposes a first analysis of kernel spectral clustering methods in the regime where the dimension $p$ of the data vectors to be clustered and their number $n$ grow large at the same rate. We demonstrate, under a $k$-class Gaussian mixture model, that the normalized Laplacian matrix associated with the kernel matrix asymptotically behaves similar to a so-called spiked random matrix. Some of the isolated eigenvalue-eigenvector pairs in this model are shown to carry the clustering information upon a separability condition classical in spiked matrix models. We evaluate precisely the position of these eigenvalues and the content of the eigenvectors, which unveil important (sometimes quite disruptive) aspects of kernel spectral clustering both from a theoretical and practical standpoints. Our results are then compared to the actual clustering performance of images from the MNIST database, thereby revealing an important match between theory and practice.

#### Article information

Source
Electron. J. Statist., Volume 10, Number 1 (2016), 1393-1454.

Dates
First available in Project Euclid: 31 May 2016

https://projecteuclid.org/euclid.ejs/1464710237

Digital Object Identifier
doi:10.1214/16-EJS1144

Mathematical Reviews number (MathSciNet)
MR3507369

Zentralblatt MATH identifier
06600843

#### Citation

Couillet, Romain; Benaych-Georges, Florent. Kernel spectral clustering of large dimensional data. Electron. J. Statist. 10 (2016), no. 1, 1393--1454. doi:10.1214/16-EJS1144. https://projecteuclid.org/euclid.ejs/1464710237

#### References

• [1] Bai, Z. D., Silverstein, J. W. (1998)., No eigenvalues outside the support of the limiting spectral distribution of large dimensional sample covariance matrices. Ann. Probab. 26 (1), 316–345.
• [2] Baik, J., Ben Arous, G., Péché, S. (2005)., Phase transition of the largest eigenvalue for non-null complex sample covariance matrices. Ann. Prob. 33 (5), 1643–1697.
• [3] Benaych-Georges, F., Couillet, R. (2016)., Spectral analysis of the Gram matrix of mixture models. ESAIM: Probability and Statistics, DOI http://dx.doi.org/10.1051/ps/2016007.
• [4] Benaych-Georges, F., Guionnet, A., Maida, M. (2011)., Fluctuations of the extreme eigenvalues of finite rank deformations of random matrices. Electron. J. Prob. 16 (60), 1621–1662.
• [5] Benaych-Georges, F., Guionnet, A., Maida, M. (2012)., Large deviations of the extreme eigenvalues of random deformations of matrices. Probab. Theory Related Fields 154 (3), 703–751.
• [6] Benaych-Georges, F., Knowles, A. (2016), Lectures on the local semicircle law for Wigner matrices. arXiv. To appear in SMF volume Panoramas et Synthèses.
• [7] Benaych-Georges, F., Nadakuditi, R. R. (2011)., The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Adv. Math. 227 (1), 494–521.
• [8] Benaych-Georges, F., Nadakuditi, R. R. (2012)., The singular values and vectors of low rank perturbations of large rectangular random matrices. J. Multivariate Anal. 111, 120–135.
• [9] Bishop, C. M. (2006)., Pattern recognition and machine learning. Information Science and Statistics. Springer.
• [10] Chapon, F., Couillet, R., Hachem, W., Mestre, X. (2014)., The outliers among the singular values of large rectangular random matrices with additive fixed rank deformation. Markov Processes and Related Fields 20, 183–228.
• [11] Couillet, R., Debbah, M., Silverstein, J. W. (2011)., A deterministic equivalent for the analysis of correlated MIMO multiple access channels. IEEE Transactions on Information Theory 57 (6), 3493–3514.
• [12] Couillet, R., Kammoun, A., Statistical subspace kernel spectral clustering of large dimensional data. Under patenting process.
• [13] Couillet, R., Hachem, W. (2013)., Fluctuations of spiked random matrix models and failure diagnosis in sensor networks. IEEE Transactions on Information Theory 59 (1), 509–525.
• [14] Couillet, R., Hachem, W. (2014)., Analysis of the limit spectral measure of large random matrices of the separable covariance type. Random Matrix Theory and Applications 3 (4), 1–23.
• [15] Couillet, R., Pascal, F., Silverstein, J. W. (2015)., The random matrix regime of Maronna’s M-estimator with elliptically distributed samples. J. Multivariate Anal. 139, 56–78.
• [16] El Karoui, N. (2009)., Concentration of measure and spectra of random matrices: Applications to correlation matrices, elliptical distributions and beyond. The Annals of Applied Probability 19 (6), 2362–2405.
• [17] El Karoui, N. (2010)., The spectrum of kernel random matrices. The Annals of Statistics 38 (1), 1–50.
• [18] Erdős, L. (2011)., Universality of wigner random matrices: A survey of recent results. Russian Mathematical Surveys 66 (3), 507.
• [19] Hachem, W., Loubaton, P., Mestre, X., Najim, J., Vallet, P. (2013)., A subspace estimator for fixed rank perturbations of large random matrices. J. Multivariate Anal. 114, 427–447.
• [20] Hanson, D. L., Wright, F. T. (1971)., A bound on tail probabilities for quadratic forms in independent random variables. The Annals of Mathematical Statistics, 1079–1083.
• [21] Horn, R. C. R. Johnson, C. (1985)., Matrix Analysis. Cambridge University Press.
• [22] Johnstone, I.M. (2001)., On the distribution of the largest Principal Component. Ann. Statist. 29, 295–327.
• [23] LeCun, Y., Cortes, C., Burges, C. (1998)., The mnist database of handwritten digits.
• [24] Marčenko, V. A., Pastur, L. A. (1967)., Distribution of eigenvalues for some sets of random matrices. Math USSR-Sbornik 1 (4), 457–483.
• [25] Ng, A. Y., Jordan, M., Weiss, Y. (2001)., On spectral clustering: Analysis and an algorithm. Proceedings of Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press 14, 849–856.
• [26] Pastur, L., Ŝerbina, M. (2011)., Eigenvalue distribution of large random matrices. Mathematical Surveys and Monographs, 171. American Mathematical Society.
• [27] Paul, D. (2007)., Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statist. Sinica 17 (4), 1617–1642.
• [28] Rudelson, M., Vershynin, R. (2013)., Hanson-wright inequality and sub-gaussian concentration. Electron. Commun. Probab 18 (0).
• [29] Silverstein, J. W., Bai, Z. D. (1995)., On the empirical distribution of eigenvalues of a class of large dimensional random matrices. J. Multivariate Anal. 54 (2), 175–192.
• [30] Silverstein, J. W., Choi, S. (1995)., Analysis of the limiting spectral distribution of large dimensional random matrices. J. Multivariate Anal. 54 (2), 295–309.
• [31] Von Luxburg, U. (2007)., A tutorial on spectral clustering. Statistics and computing 17 (4), 395–416.
• [32] Von Luxburg, U., Belkin, M., Bousquet, O. (2008)., Consistency of spectral clustering. The Annals of Statistics, 555–586.
• [33] Wagner, S., Couillet, R., Debbah, M., Slock, D. T. M. (2012)., Large system analysis of linear precoding in MISO broadcast channels with limited feedback. IEEE Transactions on Information Theory 58 (7), 4509–4537.