Open Access
November 2020
A $k$-points-based distance for robust geometric inference
Claire Brécheteau, Clément Levrard
Bernoulli 26(4): 3017-3050 (November 2020). DOI: 10.3150/20-BEJ1214

Abstract

Analyzing the sublevel sets of the distance to a compact submanifold of $\mathbb{R}^{d}$ is a common way to understand its topology in topological data analysis. Topological inference procedures therefore usually rely on a distance estimate based on $n$ sample points (Discrete Comput. Geom. 33 (2005) 249–274). When the sample points are corrupted by noise, the distance-to-measure function (DTM, Found. Comput. Math. 11 (2011) 733–751) is a surrogate for the distance-to-compact-set function. In practice, approximating the homology of its sublevel sets requires computing the homology of unions of $n$ balls (Discrete Comput. Geom. 49 (2013) 22–45; In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms (2015) 168–180 SIAM), which may become intractable when $n$ is large. To address large sample sizes and noise simultaneously, we introduce the $k$-power-distance-to-measure function ($k$-PDTM). This new surrogate for the distance-to-compact-set function is a $k$-points-based approximation of the DTM. These $k$ points are minimizers of a robustified version of the classical $k$-means criterion (In Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66) (1967) 281–297 Univ. California Press). The sublevel sets of the $k$-PDTM consist of unions of $k$ balls, and this distance is also proved to be robust to noise. We assess the quality of this approximation for $k$ possibly drastically smaller than $n$, and provide an algorithm to compute the $k$-PDTM from a sample. Numerical experiments illustrate the good behavior of this $k$-points approximation in a noisy topological inference framework.
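
To make the DTM mentioned in the abstract concrete, here is a minimal sketch of its usual empirical form: the root mean of the squared distances from a query point to its $q$ nearest sample points, which corresponds to a mass parameter of $q/n$. The function name `empirical_dtm`, the parameter `q`, and the use of NumPy are assumptions made for illustration; this is not the authors' code, and it does not implement the $k$-PDTM itself.

```python
import numpy as np


def empirical_dtm(points, queries, q):
    """Empirical distance-to-measure using the q nearest sample points.

    points  : array of shape (n, d), the sample
    queries : array of shape (m, d), where the DTM is evaluated
    q       : number of neighbours (hypothetical name), 1 <= q <= n
    """
    points = np.asarray(points, dtype=float)
    queries = np.asarray(queries, dtype=float)
    # Pairwise squared Euclidean distances, shape (m, n).
    diff = queries[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diff ** 2, axis=2)
    # The first q columns after partitioning hold the q smallest values per row.
    nearest = np.partition(sq_dists, q - 1, axis=1)[:, :q]
    # Root of the average squared distance to the q nearest neighbours.
    return np.sqrt(nearest.mean(axis=1))
```

With `q = 1` this reduces to the distance to the nearest sample point, so an isolated outlier sits at distance zero. With a larger `q`, the value at an outlier becomes large, which is the robustness to noise that motivates the DTM.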

Citation

Claire Brécheteau, Clément Levrard. "A $k$-points-based distance for robust geometric inference." Bernoulli 26 (4) 3017-3050, November 2020. https://doi.org/10.3150/20-BEJ1214

Information

Received: 1 August 2019; Revised: 1 March 2020; Published: November 2020
First available in Project Euclid: 27 August 2020

zbMATH: 07256167
MathSciNet: MR4140536
Digital Object Identifier: 10.3150/20-BEJ1214

Keywords: Minimax rates, quantization, robust distance estimation, topological inference

Rights: Copyright © 2020 Bernoulli Society for Mathematical Statistics and Probability
