Open Access
Spectral clustering based on local linear approximations
Ery Arias-Castro, Guangliang Chen, Gilad Lerman
Electron. J. Statist. 5: 1537-1587 (2011). DOI: 10.1214/11-EJS651

Abstract

In the context of clustering, we assume a generative model where each cluster is the result of sampling points in the neighborhood of an embedded smooth surface; the sample may be contaminated with outliers, which are modeled as points sampled in space away from the clusters. We consider a prototype for a higher-order spectral clustering method based on the residual from a local linear approximation. We obtain theoretical guarantees for this algorithm and show that, in terms of both separation and robustness to outliers, it outperforms the standard spectral clustering algorithm (based on pairwise distances) of Ng, Jordan and Weiss (NIPS ’01). The optimal choice for some of the tuning parameters depends on the dimension and thickness of the clusters. We provide estimators that come close enough for our theoretical purposes. We also discuss the cases of clusters of mixed dimensions and of clusters that are generated from smoother surfaces. In our experiments, this algorithm is shown to outperform pairwise spectral clustering on both simulated and real data.
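
The idea of building affinities from the residual of a local linear approximation can be illustrated with a minimal sketch. It is not the algorithm analyzed in the paper, which relies on higher-order (multiway) affinities. The sketch is a simplified pairwise variant: two nearby points receive a high affinity when the union of their neighborhoods is well fit by a d-dimensional affine subspace, and the resulting matrix is passed through a normalized spectral embedding in the style of Ng, Jordan and Weiss. The function names, the parameters `eps` (neighborhood radius) and `eta` (residual scale), and the toy data are all assumptions made for illustration.

```python
import numpy as np

def local_linear_affinity(X, d, eps, eta):
    """Pairwise affinity from the residual of a local d-dimensional affine fit.
    Illustrative simplification; eps and eta are hypothetical tuning parameters."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    near = dist <= eps                      # eps-neighborhoods (each includes the point itself)
    W = np.zeros((n, n))                    # diagonal stays zero
    for i in range(n):
        for j in range(i + 1, n):
            if not near[i, j]:
                continue                    # far-apart pairs keep zero affinity
            P = X[near[i] | near[j]]        # union of the two neighborhoods
            s = np.linalg.svd(P - P.mean(axis=0), compute_uv=False)
            # RMS distance of the neighborhood to its best-fit d-dimensional flat
            resid = np.sqrt(np.sum(s[d:] ** 2) / len(P))
            W[i, j] = W[j, i] = np.exp(-(resid / eta) ** 2)
    return W

def spectral_labels(W, K, n_iter=50):
    """Normalized spectral embedding followed by a small deterministic k-means."""
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    Z = W * inv_sqrt[:, None] * inv_sqrt[None, :]      # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(Z)                        # eigenvalues in ascending order
    U = vecs[:, -K:]                                   # top-K eigenvectors
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    # Farthest-point initialization keeps the example deterministic.
    centers = [U[0]]
    for _ in range(1, K):
        d2 = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = U[labels == k].mean(axis=0)
    return labels

# Toy data (made up): two noisy, well-separated line segments in the plane.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)
A = np.column_stack([t, np.zeros_like(t)])
B = np.column_stack([t, np.full_like(t, 5.0)])
X = np.vstack([A, B]) + 0.01 * rng.standard_normal((40, 2))

W = local_linear_affinity(X, d=1, eps=0.3, eta=0.05)
labels = spectral_labels(W, K=2)
print(labels)
```

The loop over pairs costs one small SVD per nearby pair, which is fine for a toy example but would call for a nearest-neighbor index on larger samples. The choices of `d`, `eps` and `eta` here are hand-picked for the toy data; the abstract notes that good values depend on the dimension and thickness of the clusters.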

Citation

Ery Arias-Castro, Guangliang Chen, Gilad Lerman. "Spectral clustering based on local linear approximations." Electron. J. Statist. 5: 1537-1587, 2011. https://doi.org/10.1214/11-EJS651

Information

Published: 2011
First available in Project Euclid: 23 November 2011

zbMATH: 1271.62132
MathSciNet: MR2861697
Digital Object Identifier: 10.1214/11-EJS651

Subjects:
Primary: 62G20, 62H30, 68T10

Keywords: detection of clusters in point clouds, dimension estimation, higher-order affinities, local linear approximation, local polynomial approximation, nearest-neighbor search, spectral clustering

Rights: Copyright © 2011 The Institute of Mathematical Statistics and the Bernoulli Society
