The Annals of Statistics

Bump hunting with non-Gaussian kernels

Peter Hall, Michael C. Minnotte, and Chunming Zhang


Abstract

It is well known that the number of modes of a kernel density estimator is monotone nonincreasing in the bandwidth if the kernel is a Gaussian density. There is numerical evidence of nonmonotonicity in the case of some non-Gaussian kernels, but little additional information is available. The present paper provides theoretical and numerical descriptions of the extent to which the number of modes is a nonmonotone function of bandwidth in the case of general compactly supported densities. Our results address popular kernels used in practice, for example, the Epanechnikov, biweight and triweight kernels, and show that in such cases nonmonotonicity is present with strictly positive probability for all sample sizes n ≥ 3. In the Epanechnikov and biweight cases the probability of nonmonotonicity equals 1 for all n ≥ 2. Nevertheless, in spite of the prevalence of lack of monotonicity revealed by these results, it is shown that the notion of a critical bandwidth (the smallest bandwidth above which the number of modes is guaranteed to be monotone) is still well defined. Moreover, just as in the Gaussian case, the critical bandwidth is of the same size as the bandwidth that minimises mean squared error of the density estimator. These theoretical results, and new numerical evidence, show that the main effects of nonmonotonicity occur for relatively small bandwidths, and have negligible impact on many aspects of bump hunting.
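
As an informal illustration of the nonmonotonicity described above (a minimal sketch, not the authors' code or numerical study), the following Python snippet counts the modes of a kernel density estimate on a fine grid for a toy sample of two points. The data values, the grid, the bandwidths and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def epanechnikov(u):
    # Compactly supported kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere.
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def gaussian(u):
    # Standard normal density.
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(grid, data, h, kernel):
    # Kernel density estimate evaluated at every grid point.
    u = (grid[:, None] - data[None, :]) / h
    return kernel(u).sum(axis=1) / (len(data) * h)

def count_modes(values, tol=1e-12):
    # Count local maxima as rise-to-fall sign changes of successive
    # differences, skipping flat stretches (for example, regions where the
    # estimate is exactly zero).
    d = np.diff(values)
    signs = np.sign(d[np.abs(d) > tol])
    return int(np.sum((signs[:-1] > 0) & (signs[1:] < 0)))

# Hypothetical toy sample (n = 2) and evaluation grid.
data = np.array([0.0, 1.0])
grid = np.linspace(-2.0, 3.0, 5001)
bandwidths = (0.3, 0.75, 1.5)

epan_counts = [count_modes(kde(grid, data, h, epanechnikov)) for h in bandwidths]
gauss_counts = [count_modes(kde(grid, data, h, gaussian)) for h in bandwidths]

for h, e, g in zip(bandwidths, epan_counts, gauss_counts):
    print(f"h = {h}: Epanechnikov modes = {e}, Gaussian modes = {g}")
```

For this two-point sample the Epanechnikov counts are 2, 3 and 1 as the bandwidth increases: when the two kernel supports partly overlap, a third local maximum appears midway between the data points, so the count rises before it falls. The Gaussian counts are 2, 1 and 1, which is nonincreasing. A grid-based count of this kind is only approximate in general, since modes narrower than the grid spacing can be missed.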

Article information

Source
Ann. Statist., Volume 32, Number 5 (2004), 2124-2141.

Dates
First available in Project Euclid: 27 October 2004

Permanent link to this document
https://projecteuclid.org/euclid.aos/1098883784

Digital Object Identifier
doi:10.1214/009053604000000715

Mathematical Reviews number (MathSciNet)
MR2102505

Zentralblatt MATH identifier
1056.62049

Subjects
Primary: 62G07: Density estimation
Secondary: 62G20: Asymptotic properties

Keywords
Bandwidth choice; bootstrap; critical bandwidth; density estimation; kernel methods; modality; mode test; nonparametric curve estimation; unimodality

Citation

Hall, Peter; Minnotte, Michael C.; Zhang, Chunming. Bump hunting with non-Gaussian kernels. Ann. Statist. 32 (2004), no. 5, 2124--2141. doi:10.1214/009053604000000715. https://projecteuclid.org/euclid.aos/1098883784


