The Annals of Statistics

Nonparametric ridge estimation

Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, and Larry Wasserman

Full-text: Open access


We study the problem of estimating the ridges of a density function. Ridge estimation is an extension of mode finding and is useful for understanding the structure of a density. It can also be used to find hidden structure in point cloud data. We show that, under mild regularity conditions, the ridges of the kernel density estimator consistently estimate the ridges of the true density. When the data are noisy measurements of a manifold, we show that the ridges are close and topologically similar to the hidden manifold. To find the estimated ridges in practice, we adapt the modified mean-shift algorithm proposed by Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249–1286]. Some numerical experiments verify that the algorithm is accurate.

Article information

Ann. Statist. Volume 42, Number 4 (2014), 1511-1545.

First available in Project Euclid: 7 August 2014

Permanent link to this document

Digital Object Identifier

Mathematical Reviews number (MathSciNet)

Zentralblatt MATH identifier

Primary: 62G05: Estimation 62G20: Asymptotic properties
Secondary: 62H12: Estimation

Ridges density estimation manifold learning


Genovese, Christopher R.; Perone-Pacifico, Marco; Verdinelli, Isabella; Wasserman, Larry. Nonparametric ridge estimation. Ann. Statist. 42 (2014), no. 4, 1511--1545. doi:10.1214/14-AOS1218.

Export citation


  • Aanjaneya, M., Chazal, F., Chen, D., Glisse, M., Guibas, L. and Morozov, D. (2012). Metric graph reconstruction from noisy data. Internat. J. Comput. Geom. Appl. 22 305–325.
  • Adams, H., Atanasov, A. and Carlsson, G. (2011). Morse theory in topological data analysis. Preprint. Available at arXiv:1112.1993.
  • Aragón-Calvo, M. A., Platen, E., van de Weygaert, R. and Szalay, A. S. (2010). The spine of the cosmic web. The Astrophysical Journal 723 364.
  • Arias-Castro, E., Mason, D. and Pelletier, B. (2013). On the estimation of the gradient lines of a density and the consistency of the mean-shift algorithm. Unpublished manuscript.
  • Bendich, P., Wang, B. and Mukherjee, S. (2012). Local homology transfer and stratification learning. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms 1355–1370. SIAM, New York.
  • Bhatia, R. (1997). Matrix Analysis. Graduate Texts in Mathematics 169. Springer, New York.
  • Caillerie, C., Chazal, F., Dedecker, J. and Michel, B. (2011). Deconvolution for the Wasserstein metric and geometric inference. Electron. J. Stat. 5 1394–1423.
  • Carreira-Perpinan, M. and Williams, C. (2003). On the number of modes of a Gaussian mixture. In Scale Space Methods in Computer Vision 625–640. Springer, New York.
  • Chacón, J. E. (2012). Clusters and water flows: A novel approach to modal clustering through morse theory. Preprint. Available at arXiv:1212.1384.
  • Chacón, J. and Duong, T. (2012). Bandwidth selection for multivariate density derivative estimation, with applications to clustering and bump hunting. Preprint. Available at arXiv:1204.6160.
  • Chacón, J. E. and Duong, T. (2010). Multivariate plug-in bandwidth selection with unconstrained pilot bandwidth matrices. TEST 19 375–398.
  • Chacón, J. E., Duong, T. and Wand, M. P. (2011). Asymptotics for general multivariate kernel density derivative estimators. Statist. Sinica 21 807–840.
  • Chazal, F., Cohen-Steiner, D. and Lieutier, A. (2009). A sampling theorem for compact sets in Euclidean space. Discrete Comput. Geom. 41 461–479.
  • Chazal, F. and Lieutier, A. (2005). The $\lambda$-medial axis. Graphical Models 67 304–331.
  • Chazal, F., Oudot, S., Skraba, P. and Guibas, L. J. (2011). Persistence-based clustering in Riemannian manifolds. In Computational Geometry (SCG’11) 97–106. ACM, New York.
  • Cheng, Y. (1995). Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 17 790–799.
  • Cheng, M.-Y., Hall, P. and Hartigan, J. A. (2004). Estimating gradient trees. In A Festschrift for Herman Rubin. Institute of Mathematical Statistics Lecture Notes—Monograph Series 45 237–249. IMS, Beachwood, OH.
  • Comaniciu, D. and Meer, P. (2002). Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 603–619.
  • Davenport, M. A., Hegde, C., Duarte, M. F. and Baraniuk, R. G. (2010). Joint manifolds for data fusion. IEEE Trans. Image Process. 19 2580–2594.
  • Eberly, D. (1996). Ridges in image and data analysis. Kluwer Academic, Boston.
  • Edelsbrunner, H., Fasy, B. T. and Rote, G. (2012). Add isotropic Gaussian kernels at own risk: More and more resilient modes in higher dimensions. In Computational Geometry (SCG’12) 91–100. ACM, New York.
  • Fukunaga, K. and Hostetler, L. D. (1975). The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans. Inform. Theory IT-21 32–40.
  • Ge, X., Safa, I., Belkin, M. and Wang, Y. (2011). Data skeletonization via reeb graphs. In Advances in Neural Information Processing Systems 24 (J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira and K. Q. Weinberger, eds.) 837–845. Curran Associates, New York.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2009). On the path density of a gradient field. Ann. Statist. 37 3236–3271.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2012a). The geometry of nonparametric filament estimation. J. Amer. Statist. Assoc. 107 788–799.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2012b). Manifold estimation and singular deconvolution under Hausdorff loss. Ann. Statist. 40 941–963.
  • Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2012c). Minimax manifold estimation. J. Mach. Learn. Res. 13 1263–1291.
  • Giné, E. and Guillou, A. (2002). Rates of strong uniform consistency for multivariate kernel density estimators. Ann. Inst. Henri Poincaré Probab. Stat. 38 907–921.
  • Hall, P., Peng, L. and Rau, C. (2001). Local likelihood tracking of fault lines and boundaries. J. R. Stat. Soc. Ser. B Stat. Methodol. 63 569–582.
  • Hall, P., Qian, W. and Titterington, D. M. (1992). Ridge finding from noisy data. J. Comput. Graph. Statist. 1 197–211.
  • Horn, R. A. and Johnson, C. R. (2013). Matrix Analysis, 2nd ed. Cambridge Univ. Press, Cambridge.
  • Irwin, M. C. (1980). Smooth Dynamical Systems. Pure and Applied Mathematics 94. Academic Press, New York.
  • Klemelä, J. (2005). Adaptive estimation of the mode of a multivariate density. J. Nonparametr. Stat. 17 83–105.
  • Klemelä, J. (2009). Smoothing of Multivariate Data: Density Estimation and Visualization. Wiley, Hoboken, NJ.
  • Lecci, F., Rinaldo, A. and Wasserman, L. (2013). Statistical analysis of metric graph reconstruction. Preprint. Available at arXiv:1305.1212.
  • Li, J., Ray, S. and Lindsay, B. G. (2007). A nonparametric statistical approach to clustering via mode identification. J. Mach. Learn. Res. 8 1687–1723.
  • Magnus, X. and Neudecker, H. (1988). Matrix Differential Calculus. Wiley, New York.
  • Niyogi, P., Smale, S. and Weinberger, S. (2008). Finding the homology of submanifolds with high confidence from random samples. Discrete Comput. Geom. 39 419–441.
  • Norgard, G. and Bremer, P.-T. (2012). Second derivative ridges are straight lines and the implications for computing lagrangian coherent structures. Phys. D 241 1475–1476.
  • Ozertem, U. and Erdogmus, D. (2011). Locally defined principal curves and surfaces. J. Mach. Learn. Res. 12 1249–1286.
  • Panaretos, V. M. and Konis, K. (2012). Nonparametric construction of multivariate kernels. J. Amer. Statist. Assoc. 107 1085–1095.
  • Peikert, R., Günther, D. and Weinkauf, T. (2012). Comment on “Second derivative ridges are straight lines and the implications for computing lagrangian coherent structures”. Phys. D 242 65–66.
  • Prakasa Rao, B. L. S. (1983). Nonparametric Functional Estimation. Academic Press, New York.
  • Roweis, S. T. and Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science 290 2323–2326.
  • Schindler, B., Peikert, R., Fuchs, R. and Theisel, H. (2012). Ridge concepts for the visualization of Lagrangian coherent structures. In Topological Methods in Data Analysis and Visualization II. 221–235. Springer, Heidelberg.
  • Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley, New York.
  • Sousbie, T., Pichon, C., Courtois, H., Colombi, S. and Novikov, D. (2008). The three-dimensional skeleton of the SDSS. The Astrophysical Journal Letters 672 L1.
  • Stewart, G. W. and Sun, J. G. (1990). Matrix Perturbation Theory. Academic Press, Boston, MA.
  • Tenenbaum, J. B., de Silva, V. and Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science 290 2319–2323.
  • von Luxburg, U. (2007). A tutorial on spectral clustering. Stat. Comput. 17 395–416.
  • Wegman, E., Carr, D. and Luo, Q. (1993). Visualizing multivariate data. In Multivariate Analysis: Future Directions 423–466. Elsevier, Washington, DC.
  • Wegman, E. and Luo, Q. (2002). Smoothings, ridges, and bumps. In Proceedings of the ASA (published on CD). Development of the relationship between geometric aspects of visualizing densities and density approximators, and a discussion of rendering and lighting models, contouring algorithms, stereoscopic display algorithms, and visual design considerations 3666–3672. American Statistical Association.
  • Yukich, J. E. (1985). Laws of large numbers for classes of functions. J. Multivariate Anal. 17 245–260.