The Annals of Statistics

Confidence sets for persistence diagrams

Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, and Aarti Singh

Abstract

Persistent homology is a method for probing topological properties of point clouds and functions. The method involves tracking the birth and death of topological features as one varies a tuning parameter. Features with short lifetimes are informally considered to be “topological noise,” and those with a long lifetime are considered to be “topological signal.” In this paper, we bring some statistical ideas to persistent homology. In particular, we derive confidence sets that allow us to separate topological signal from topological noise.
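
As an informal illustration of the signal/noise distinction described in the abstract, the sketch below splits a toy persistence diagram by feature lifetime (death minus birth). The function name `split_by_lifetime`, the cutoff `threshold`, and the diagram values are all hypothetical and chosen only for illustration; the abstract does not say how a cutoff would be derived from the confidence sets, so this is not the authors' procedure.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (birth, death) of one topological feature


def split_by_lifetime(
    diagram: List[Interval], threshold: float
) -> Tuple[List[Interval], List[Interval]]:
    """Split features into (signal, noise) by lifetime = death - birth.

    `threshold` is a hypothetical cutoff supplied by the caller; how to
    choose it in a statistically valid way is not shown here.
    """
    signal = [(b, d) for b, d in diagram if d - b > threshold]
    noise = [(b, d) for b, d in diagram if d - b <= threshold]
    return signal, noise


# Toy diagram with made-up values, for illustration only.
diagram = [(0.0, 0.1), (0.0, 1.5), (0.2, 0.3), (0.4, 1.2)]
signal, noise = split_by_lifetime(diagram, threshold=0.5)
print("signal:", signal)
print("noise:", noise)
```

In this toy setting, the two long-lived features are labeled signal and the two short-lived ones noise; a confidence set would replace the hand-picked cutoff with one that carries a coverage guarantee.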

Article information

Source
Ann. Statist., Volume 42, Number 6 (2014), 2301-2339.

Dates
First available in Project Euclid: 20 October 2014

Permanent link to this document
https://projecteuclid.org/euclid.aos/1413810729

Digital Object Identifier
doi:10.1214/14-AOS1252

Mathematical Reviews number (MathSciNet)
MR3269981

Zentralblatt MATH identifier
1310.62059

Subjects
Primary: 62G05: Estimation; 62G20: Asymptotic properties
Secondary: 62H12: Estimation

Keywords
Persistent homology; topology; density estimation

Citation

Fasy, Brittany Terese; Lecci, Fabrizio; Rinaldo, Alessandro; Wasserman, Larry; Balakrishnan, Sivaraman; Singh, Aarti. Confidence sets for persistence diagrams. Ann. Statist. 42 (2014), no. 6, 2301–2339. doi:10.1214/14-AOS1252. https://projecteuclid.org/euclid.aos/1413810729


Supplemental materials

  • Supplementary material: Supplement to “Confidence sets for persistence diagrams.” In the supplementary material we give a brief introduction to persistent homology and provide additional details about homology, simplicial complexes and stability of persistence diagrams.