The Annals of Statistics

On the path density of a gradient field

Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, and Larry Wasserman



We consider the problem of reliably finding filaments in point clouds. Realistic data sets often have numerous filaments of various sizes and shapes. Statistical techniques exist for finding one (or a few) filaments but these methods do not handle noisy data sets with many filaments. Other methods can be found in the astronomy literature but they do not have rigorous statistical guarantees. We propose the following method. Starting at each data point we construct the steepest ascent path along a kernel density estimator. We locate filaments by finding regions where these paths are highly concentrated. Formally, we define the density of these paths and we construct a consistent estimator of this path density.
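The procedure described in the abstract can be illustrated with a small numerical sketch. This is not the authors' estimator. It assumes a Gaussian kernel, two-dimensional data, and a damped mean-shift update as the discretization of the steepest ascent path (the mean-shift vector is parallel to the gradient of the kernel density estimate). The function names, the bandwidth `h`, the damping factor `eta`, and the grid-based concentration measure are all hypothetical choices made for illustration only.

```python
import numpy as np

def kde(x, data, h):
    """Gaussian kernel density estimate at a single point x of shape (d,)."""
    d = data.shape[1]
    sq = np.sum((data - x) ** 2, axis=1)
    norm = (2.0 * np.pi * h**2) ** (d / 2.0)
    return np.exp(-0.5 * sq / h**2).sum() / (len(data) * norm)

def mean_shift_vector(x, data, h):
    """Kernel-weighted mean of the data minus x; parallel to the KDE gradient at x."""
    w = np.exp(-0.5 * np.sum((data - x) ** 2, axis=1) / h**2)
    return (w[:, None] * data).sum(axis=0) / w.sum() - x

def ascent_path(x0, data, h, eta=0.2, max_steps=200, tol=1e-6):
    """Discretized uphill path from x0 using damped mean-shift steps (an assumption)."""
    path = [np.asarray(x0, dtype=float)]
    for _ in range(max_steps):
        m = mean_shift_vector(path[-1], data, h)
        if np.linalg.norm(m) < tol:
            break
        path.append(path[-1] + eta * m)
    return np.array(path)

def path_concentration(paths, bins, extent):
    """Fraction of paths visiting each grid cell: a crude stand-in for a path density."""
    (xmin, xmax), (ymin, ymax) = extent
    counts = np.zeros((bins, bins))
    for p in paths:
        ix = np.clip(((p[:, 0] - xmin) / (xmax - xmin) * bins).astype(int), 0, bins - 1)
        iy = np.clip(((p[:, 1] - ymin) / (ymax - ymin) * bins).astype(int), 0, bins - 1)
        cells = np.unique(np.stack([ix, iy], axis=1), axis=0)
        counts[cells[:, 0], cells[:, 1]] += 1  # each path counts once per cell
    return counts / len(paths)

# Synthetic example: a noisy horizontal filament (made-up data, for illustration).
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, 100)
data = np.column_stack([t, 0.1 * rng.standard_normal(100)])
h = 0.2

paths = [ascent_path(x, data, h) for x in data]
conc = path_concentration(paths, bins=20, extent=((-1.5, 1.5), (-1.5, 1.5)))
```

Cells of `conc` with large values are those crossed by many ascent paths, which is the informal sense in which filaments appear as regions of high path concentration. A careful treatment would replace the grid counts with the formal path density and its estimator defined in the paper.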

Article information

Ann. Statist., Volume 37, Number 6A (2009), 3236-3271.

First available in Project Euclid: 17 August 2009


Primary: 62G99 (None of the above, but in this section); 62G07 (Density estimation); 62G20 (Asymptotic properties)

Keywords: filaments; gradient field; nonparametric density estimation


Genovese, Christopher R.; Perone-Pacifico, Marco; Verdinelli, Isabella; Wasserman, Larry. On the path density of a gradient field. Ann. Statist. 37 (2009), no. 6A, 3236-3271. doi:10.1214/08-AOS671.


