Open Access
February 2021
Concentration of kernel matrices with application to kernel spectral clustering
Arash A. Amini, Zahra S. Razaee
Ann. Statist. 49(1): 531-556 (February 2021). DOI: 10.1214/20-AOS1967

Abstract

We study the concentration of random kernel matrices around their mean. We derive nonasymptotic exponential concentration inequalities for Lipschitz kernels assuming that the data points are independent draws from a class of multivariate distributions on $\mathbb{R}^{d}$, including the strongly log-concave distributions under affine transformations. A feature of our result is that the data points need not have identical distributions or zero mean, which is key in certain applications such as clustering. Our bound for the Lipschitz kernels is dimension-free and sharp up to constants. For comparison, we also derive the companion result for the Euclidean (inner product) kernel for a class of sub-Gaussian distributions. A notable difference between the two cases is that, in contrast to the Euclidean kernel, in the Lipschitz case, the concentration inequality does not depend on the mean of the underlying vectors. As an application of these inequalities, we derive a bound on the misclassification rate of a kernel spectral clustering (KSC) algorithm, under a perturbed nonparametric mixture model. We show an example where this bound establishes the high-dimensional consistency (as $d\to \infty $) of the KSC, when applied with a Gaussian kernel, to a noisy model of nested nonlinear manifolds.
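
As a rough illustration of the kind of procedure the abstract refers to, the sketch below applies a simple two-cluster kernel spectral clustering step with a Gaussian kernel to two noisy concentric circles. It is not the paper's algorithm or model: the function names (`make_nested_circles`, `gaussian_kernel_matrix`, `two_way_spectral_clustering`, `clustering_accuracy`), the low-dimensional toy data ($d = 2$), the bandwidth, and the choice of thresholding the second eigenvector of the degree-normalized kernel matrix are all assumptions made for this example.

```python
import numpy as np


def make_nested_circles(n_per_ring=100, radii=(1.0, 2.5), noise=0.05, seed=0):
    """Noisy concentric circles in R^2 (a toy stand-in for nested manifolds)."""
    rng = np.random.default_rng(seed)
    angles = np.linspace(0.0, 2.0 * np.pi, n_per_ring, endpoint=False)
    rings = [r * np.column_stack([np.cos(angles), np.sin(angles)]) for r in radii]
    X = np.vstack(rings)
    X = X + noise * rng.standard_normal(X.shape)  # additive Gaussian perturbation
    labels = np.repeat(np.arange(len(radii)), n_per_ring)
    return X, labels


def gaussian_kernel_matrix(X, bandwidth):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def two_way_spectral_clustering(K):
    """Split into two clusters by the sign of the second leading eigenvector
    of the degree-normalized kernel matrix D^{-1/2} K D^{-1/2}.
    (An assumed, simplified variant; not taken from the paper.)"""
    inv_sqrt_deg = 1.0 / np.sqrt(K.sum(axis=1))
    M = inv_sqrt_deg[:, None] * K * inv_sqrt_deg[None, :]
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues returned in ascending order
    second = eigvecs[:, -2] * inv_sqrt_deg  # undo the D^{1/2} scaling
    return (second > 0).astype(int)


def clustering_accuracy(pred, truth):
    """Fraction of agreeing labels, up to swapping the two cluster names."""
    agree = float(np.mean(pred == truth))
    return max(agree, 1.0 - agree)
```

A possible usage, with the bandwidth 0.3 chosen by hand for this toy data, is `X, y = make_nested_circles()`, then `K = gaussian_kernel_matrix(X, bandwidth=0.3)` and `clustering_accuracy(two_way_spectral_clustering(K), y)`. The bandwidth matters: it needs to be small relative to the gap between the circles yet large relative to the spacing of points along each circle.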

Citation

Arash A. Amini, Zahra S. Razaee. "Concentration of kernel matrices with application to kernel spectral clustering." Ann. Statist. 49 (1) 531-556, February 2021. https://doi.org/10.1214/20-AOS1967

Information

Received: 1 September 2019; Revised: 1 January 2020; Published: February 2021
First available in Project Euclid: 29 January 2021

Digital Object Identifier: 10.1214/20-AOS1967

Subjects:
Primary: 60E15, 62G99, 62H20

Keywords: Concentration inequalities, kernel matrices, kernel spectral clustering, nonasymptotic bounds

Rights: Copyright © 2021 Institute of Mathematical Statistics
