The Annals of Statistics

Asymptotic theory for density ridges

Yen-Chi Chen, Christopher R. Genovese, and Larry Wasserman

Abstract

The large-sample theory of estimators for density modes is well understood. In this paper, we consider density ridges, which are a higher-dimensional extension of modes. Modes correspond to zero-dimensional, local high-density regions in point clouds. Density ridges correspond to $s$-dimensional, local high-density regions in point clouds. We establish three main results. First, we show that under appropriate regularity conditions, the local variation of the estimated ridge can be approximated by an empirical process. Second, we show that the distribution of the estimated ridge converges to a Gaussian process. Third, we establish that the bootstrap leads to valid confidence sets for density ridges.

Article information

Source
Ann. Statist. Volume 43, Number 5 (2015), 1896-1928.

Dates
Received: June 2014
Revised: March 2015
First available in Project Euclid: 3 August 2015

Permanent link to this document
http://projecteuclid.org/euclid.aos/1438606848

Digital Object Identifier
doi:10.1214/15-AOS1329

Mathematical Reviews number (MathSciNet)
MR3375871

Zentralblatt MATH identifier
1327.62303

Subjects
Primary: 62G20: Asymptotic properties
Secondary: 62G15: Tolerance and confidence regions; 62G05: Estimation

Keywords
Ridge; density estimation; nonparametric statistics; empirical process; bootstrap

Citation

Chen, Yen-Chi; Genovese, Christopher R.; Wasserman, Larry. Asymptotic theory for density ridges. Ann. Statist. 43 (2015), no. 5, 1896–1928. doi:10.1214/15-AOS1329. http://projecteuclid.org/euclid.aos/1438606848.


Supplemental materials

  • Supplementary proofs: Asymptotic theory for density ridges. The supplementary material contains proofs of Lemmas 1, 2, 4, 9, 13, 14 and 17.