Recently, Chazal, Cohen-Steiner and Mérigot defined a distance function to measures to address geometric inference problems in a probabilistic setting. According to their result, the topological properties of a shape can be recovered from the distance to a known measure ν, provided that ν is close enough to a measure μ concentrated on this shape. Here, close enough means that the Wasserstein distance W2 between μ and ν is sufficiently small. Given a point cloud, a natural candidate for ν is the empirical measure μn. In many situations, however, the data points do not lie on the geometric shape but in a neighborhood of it, and μn can be too far from μ. In a deconvolution framework, we consider a slight modification of the classical kernel deconvolution estimator, and we give a consistency result and rates of convergence for this estimator. Simulated experiments illustrate the deconvolution method and its application to geometric inference on various shapes and with various noise distributions.
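To make the distance to a measure concrete in the simplest case, where ν is the empirical measure μn, the following Python sketch may help. It relies on the nearest-neighbour form of the distance to an empirical measure: for a mass parameter m such that k = mn is an integer, the squared distance at a point x is the average of the squared distances from x to its k nearest sample points. The function name, the parameter name `m`, and the rounding of mn to an integer are assumptions made for illustration; this is not the deconvolution estimator studied here.

```python
import numpy as np


def distance_to_empirical_measure(points, queries, m):
    """Distance to the empirical measure of `points`, evaluated at `queries`.

    Assumption: m * n is (close to) an integer k, so that the squared
    distance is the mean of the k smallest squared distances to the sample.
    """
    points = np.asarray(points, dtype=float)    # shape (n, d)
    queries = np.asarray(queries, dtype=float)  # shape (q, d)
    n = points.shape[0]
    k = max(1, int(round(m * n)))               # number of neighbours used

    # Pairwise squared Euclidean distances, shape (q, n).
    diff = queries[:, None, :] - points[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)

    # Keep the k smallest squared distances for each query point.
    k_smallest = np.sort(sq_dist, axis=1)[:, :k]
    return np.sqrt(k_smallest.mean(axis=1))


if __name__ == "__main__":
    # Illustrative data: noisy samples around the unit circle.
    rng = np.random.default_rng(0)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
    circle = np.column_stack([np.cos(angles), np.sin(angles)])
    noisy = circle + 0.05 * rng.standard_normal(circle.shape)
    grid = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
    print(distance_to_empirical_measure(noisy, grid, m=0.05))
```

With a small mass parameter, the function is expected to be small near the sampled shape and larger away from it, which is what makes its sublevel sets useful for geometric inference.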
Bergström, H. (1952). On some expansions of stable distribution functions. Ark. Mat. 2 375–378.
Mathematical Reviews (MathSciNet): MR65053
Biau, G., Cadre, B. and Pelletier, B. (2008). Exact rates in density support estimation. J. Multivariate Anal. 99 2185–2207.
Butucea, C. and Matias, C. (2005). Minimax estimation of the noise level and of the deconvolution density in a semiparametric convolution model. Bernoulli 11 309–340.
Carroll, R. J. and Hall, P. (1988). Optimal rates of convergence for deconvolving a density. J. Amer. Statist. Assoc. 83 1184–1186.
Mathematical Reviews (MathSciNet): MR997599
Chazal, F., Cohen-Steiner, D. and Lieutier, A. (2009). A Sampling Theory for Compact Sets in Euclidean Spaces. Discrete Comput. Geom. 41 461–479.
Chazal, F., Cohen-Steiner, D. and Mérigot, Q. Geometric inference for probability measures. J. Foundations of Computational Mathematics, to appear.
Chazal, F. and Lieutier, A. (2008). Smooth Manifold Reconstruction from Noisy and Non Uniform Approximation with Guarantees. Comp. Geom: Theory and Applications 40 156–170.
Comte, F. and Lacour, C. (2011). Data driven density estimation in presence of unknown convolution operator. J. Royal Stat. Soc., Ser B 73 601–627.
Cuevas, A., Febrero, M. and Fraiman, R. (2000). Estimating the number of clusters. Canad. J. Statist. 28 367–382.
Cuevas, A., Fraiman, R. and Rodríguez-Casal, A. (2007). A nonparametric approach to the estimation of lengths and surface areas. Ann. Statist. 35 1031–1051.
Cuevas, A. and Fraiman, R. (2010). Set estimation. In New perspectives in stochastic geometry 374–397. Oxford Univ. Press, Oxford.
Delaigle, A. and Gijbels, I. (2006). Estimation of boundary and discontinuity points in deconvolution problems. Statist. Sinica 16 773–788.
Delaigle, A. and Hall, P. (2006). On optimal kernel choice for deconvolution. Statist. Probab. Lett. 76 1594–1602.
Devroye, L. (1989). Consistent deconvolution in density estimation. Canad. J. Statist. 17 235–239.
Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2009). On the path density of a gradient field. Ann. Statist. 37 3236–3271.
Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2010). The Geometry of Nonparametric Filament Estimation. arXiv:1003.5536v2.
Hall, P. and Simar, L. (2002). Estimating a changepoint, boundary, or frontier in the presence of observation error. J. Amer. Statist. Assoc. 97 523–534.
Hartigan, J. A. (1975). Clustering algorithms. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York-London-Sydney.
Mathematical Reviews (MathSciNet): MR405726
Hastie, T. and Stuetzle, W. (1989). Principal curves. J. Amer. Statist. Assoc. 84 502–516.
Horowitz, J. and Karandikar, R. L. (1994). Mean rates of convergence of empirical measures in the Wasserstein metric. J. Comput. Appl. Math. 55 261–273.
Koldobsky, A. (2005). Fourier analysis in convex geometry. Mathematical Surveys and Monographs 116. American Mathematical Society, Providence, RI.
Koltchinskii, V. I. (2000). Empirical geometry of multivariate data: a deconvolution approach. Ann. Statist. 28 591–629.
Li, T. and Vuong, Q. (1998). Nonparametric estimation of the measurement error model using multiple indicators. J. Multivariate Anal. 65 139–165.
Meister, A. (2004). On the effect of misspecifying the error density in a deconvolution problem. Canad. J. Statist. 32 439–449.
Meister, A. (2006a). Support estimation via moment estimation in presence of noise. Statistics 40 259–275.
Meister, A. (2006b). Estimating the support of multivariate densities under measurement error. J. Multivariate Anal. 97 1702–1717.
Meister, A. (2007). Deconvolving compactly supported densities. Math. Methods Statist. 16 63–76.
Meister, A. (2009). Deconvolution problems in nonparametric statistics. Lecture Notes in Statistics 193. Springer-Verlag.
Neumann, M. H. (1997). On the effect of estimating the error density in nonparametric deconvolution. J. Nonparametr. Statist. 7 307–330.
Niyogi, P., Smale, S. and Weinberger, S. (2011). A Topological View of Unsupervised Learning from Noisy Data. SIAM Journal on Computing 40 646–663.
Petrunin, A. (2007). Semiconcave functions in Alexandrov’s geometry. In Surveys in differential geometry. Vol. XI 137–201. Int. Press, Somerville, MA.
Rachev, S. T. (1991). Probability metrics and the stability of stochastic models. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons Ltd., Chichester.
Rachev, S. T. and Rüschendorf, L. (1998). Mass transportation problems. Vol. II. Probability and its Applications. Springer-Verlag.
Schwarz, M. and Van Bellegem, S. (2010). Consistent density deconvolution under partially known error distribution. Statist. Probab. Lett. 80 236–241.
Stefanski, L. and Carroll, R. J. (1990). Deconvoluting kernel density estimators. Statistics 21 169–184.
Villani, C. (2008). Optimal Transport: Old and New. Grundlehren Der Mathematischen Wissenschaften. Springer-Verlag.
Zolotarev, V. M. (1978). Pseudomoments. Teor. Verojatnost. i Primenen. 23 284–294.
Mathematical Reviews (MathSciNet): MR517340