Open Access
2020 Modal clustering asymptotics with applications to bandwidth selection
Alessandro Casa, José E. Chacón, Giovanna Menardi
Electron. J. Statist. 14(1): 835-856 (2020). DOI: 10.1214/20-EJS1679

Abstract

Density-based clustering relies on the idea of linking groups to some specific features of the probability distribution underlying the data. The reference to a true, yet unknown, population structure allows framing the clustering problem in a standard inferential setting, where the concept of ideal population clustering is defined as the partition induced by the true density function. The nonparametric formulation of this approach, known as modal clustering, draws a correspondence between the groups and the domains of attraction of the density modes. Operationally, a nonparametric density estimate is required and a proper selection of the amount of smoothing, governing the shape of the density and hence possibly the modal structure, is crucial to identify the final partition. In this work, we address the issue of density estimation for modal clustering from an asymptotic perspective. A natural and easy-to-interpret metric to measure the distance between density-based partitions is discussed, its asymptotic approximation is explored, and it is employed to study the problem of bandwidth selection for nonparametric modal clustering.
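
As a rough illustration of the modal clustering idea described in the abstract, the following Python sketch assigns each observation to the mode it reaches by mean shift on a Gaussian kernel density estimate. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names (`mean_shift_modes`, `modal_clustering`), the scalar bandwidth `h`, the merging tolerance, and the simulated data are all hypothetical choices made for illustration, and the bandwidths are picked by hand rather than by the selectors studied in the paper.

```python
import numpy as np

def mean_shift_modes(X, h, n_iter=200, tol=1e-6):
    """Move a copy of each point uphill on a Gaussian KDE with bandwidth h."""
    X = np.asarray(X, dtype=float)
    Y = X.copy()
    for _ in range(n_iter):
        # Squared distances between current positions and the data points.
        d2 = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        W = np.exp(-0.5 * d2 / h**2)               # Gaussian kernel weights
        Y_new = W @ X / W.sum(axis=1, keepdims=True)  # weighted local means
        shift = np.abs(Y_new - Y).max()
        Y = Y_new
        if shift < tol:
            break
    return Y

def modal_clustering(X, h, merge_tol=None):
    """Label points by the (approximate) mode their mean shift path reaches."""
    Y = mean_shift_modes(X, h)
    if merge_tol is None:
        merge_tol = 1e-2 * h  # assumed tolerance for merging end points
    modes, labels = [], np.empty(len(Y), dtype=int)
    for i, y in enumerate(Y):
        for k, m in enumerate(modes):
            if np.linalg.norm(y - m) < merge_tol:
                labels[i] = k
                break
        else:
            modes.append(y)
            labels[i] = len(modes) - 1
    return labels, np.array(modes)

# Simulated data (an assumption): two well-separated Gaussian groups in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3.0, 0.5, size=(50, 2)),
               rng.normal(3.0, 0.5, size=(50, 2))])

labels, modes = modal_clustering(X, h=1.0)           # moderate smoothing
labels_big, modes_big = modal_clustering(X, h=10.0)  # heavy smoothing
```

With the moderate bandwidth the estimate keeps one mode per simulated group, whereas the heavily oversmoothed estimate is unimodal and all points fall in a single cluster. This is the sense in which the amount of smoothing governs the modal structure and hence the final partition.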

Citation


Alessandro Casa, José E. Chacón, Giovanna Menardi. "Modal clustering asymptotics with applications to bandwidth selection." Electron. J. Statist. 14 (1) 835-856, 2020. https://doi.org/10.1214/20-EJS1679

Information

Received: 1 January 2019; Published: 2020
First available in Project Euclid: 8 February 2020

zbMATH: 07200218
MathSciNet: MR4062160
Digital Object Identifier: 10.1214/20-EJS1679

Subjects:
Primary: 62G20, 62H30
Secondary: 62G07

Keywords: gradient bandwidth, kernel estimator, mean shift clustering, nonparametric clustering, plug-in bandwidth

Vol. 14 • No. 1 • 2020