The Annals of Statistics

Asymptotics and optimal bandwidth selection for highest density region estimation

R. J. Samworth and M. P. Wand

Full-text: Open access

Abstract

We study kernel estimation of highest-density regions (HDR). Our main contributions are two-fold. First, we derive a uniform-in-bandwidth asymptotic approximation to a risk that is appropriate for HDR estimation. This approximation is then used to derive a bandwidth selection rule for HDR estimation possessing attractive asymptotic properties. We also present the results of numerical studies that illustrate the benefits of our theory and methodology.
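As a rough illustration of what a kernel plug-in HDR estimate looks like, the sketch below builds a one-dimensional Gaussian kernel density estimate and thresholds it at a sample quantile of the estimated density heights, in the spirit of the computation described by Hyndman (1996). It is a minimal sketch under stated assumptions, not the method of this paper: the function names, the simulated data, the 90% coverage level, and the normal-scale rule-of-thumb bandwidth are all illustrative choices, and the bandwidth in particular is a placeholder rather than the selection rule derived by the authors.

```python
import numpy as np


def gaussian_kde_1d(data, h):
    """Return a function that evaluates a Gaussian KDE with bandwidth h."""
    data = np.asarray(data, dtype=float)

    def f_hat(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        # Pairwise standardized distances between evaluation and data points.
        z = (x[:, None] - data[None, :]) / h
        return np.exp(-0.5 * z ** 2).sum(axis=1) / (
            data.size * h * np.sqrt(2.0 * np.pi)
        )

    return f_hat


def hdr_threshold(data, h, coverage=0.9):
    """Estimate the density level whose upper level set has the given coverage.

    The level is taken as the (1 - coverage) sample quantile of the
    estimated density heights at the data points.
    """
    f_hat = gaussian_kde_1d(data, h)
    heights = f_hat(data)
    return f_hat, float(np.quantile(heights, 1.0 - coverage))


# Hypothetical data: 500 draws from a standard normal distribution.
rng = np.random.default_rng(0)
sample = rng.normal(size=500)

# Placeholder bandwidth (normal-scale rule of thumb), NOT the paper's selector.
h = 1.06 * sample.std(ddof=1) * sample.size ** (-1 / 5)

f_hat, level = hdr_threshold(sample, h, coverage=0.9)

# The estimated HDR is the set of grid points where the KDE exceeds the level.
grid = np.linspace(-4.0, 4.0, 801)
in_hdr = f_hat(grid) >= level
```

The estimated region depends on the bandwidth `h` through both the density estimate and the threshold, which is why a bandwidth chosen for a risk tailored to HDR estimation can differ from one chosen to minimize a global criterion such as integrated squared error.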

Article information

Source
Ann. Statist. Volume 38, Number 3 (2010), 1767-1792.

Dates
First available in Project Euclid: 24 March 2010

Permanent link to this document
https://projecteuclid.org/euclid.aos/1269452654

Digital Object Identifier
doi:10.1214/09-AOS766

Mathematical Reviews number (MathSciNet)
MR2662359

Zentralblatt MATH identifier
1189.62061

Subjects
Primary: 62G07: Density estimation; 62G20: Asymptotic properties

Keywords
Density contour; density level set; kernel density estimator; plug-in bandwidth selection

Citation

Samworth, R. J.; Wand, M. P. Asymptotics and optimal bandwidth selection for highest density region estimation. Ann. Statist. 38 (2010), no. 3, 1767-1792. doi:10.1214/09-AOS766. https://projecteuclid.org/euclid.aos/1269452654


