Electronic Journal of Statistics

Exact risk improvement of bandwidth selectors for kernel density estimation with directional data

Eduardo García–Portugués


Abstract

New bandwidth selectors for kernel density estimation with directional data are presented in this work. These selectors are based on asymptotic and exact error expressions for the kernel density estimator combined with mixtures of von Mises distributions. The performance of the proposed selectors is investigated in a simulation study and compared with other existing rules for a large variety of directional scenarios, sample sizes and dimensions. The selector based on the exact error expression performs best among the studied selectors in almost all situations. This selector is illustrated with real data for the circular and spherical cases.
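
As a rough illustration of the kind of estimator whose bandwidth is being selected, the following Python sketch evaluates a circular kernel density estimate with a von Mises kernel. The function name `vm_kde`, the parametrisation of the kernel concentration as 1/h^2, the fixed bandwidth h = 0.5 and the simulated sample are all assumptions made for illustration; none of them is taken from the paper, and the sketch does not implement any of the proposed selectors.

```python
import numpy as np

def vm_kde(theta_eval, sample, h):
    """Circular KDE with a von Mises kernel (hypothetical helper).

    Assumes the kernel concentration is kappa = 1 / h**2, so a smaller
    bandwidth h gives a more concentrated kernel.
    """
    kappa = 1.0 / h**2
    # Pairwise angular differences: rows = evaluation points, cols = data.
    diff = np.asarray(theta_eval)[:, None] - np.asarray(sample)[None, :]
    # exp(kappa * (cos - 1)) keeps the numerator bounded by 1.
    kern = np.exp(kappa * (np.cos(diff) - 1.0))
    # Matching normalising constant: 2 * pi * I0(kappa) * exp(-kappa).
    norm = 2.0 * np.pi * np.i0(kappa) * np.exp(-kappa)
    return kern.mean(axis=1) / norm

# Simulated circular sample (angles in radians), for illustration only.
rng = np.random.default_rng(0)
sample = rng.vonmises(mu=0.0, kappa=2.0, size=200)

# Evaluate on an equally spaced grid over the circle and check that the
# estimate integrates to approximately one.
grid = np.linspace(0.0, 2.0 * np.pi, 512, endpoint=False)
density = vm_kde(grid, sample, h=0.5)
integral = density.sum() * (2.0 * np.pi / grid.size)
```

In this sketch the bandwidth is fixed by hand; a data-driven selector such as those studied in the paper would instead choose `h` from the sample.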

Article information

Source
Electron. J. Statist., Volume 7 (2013), 1655-1685.

Dates
First available in Project Euclid: 19 June 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1371649230

Digital Object Identifier
doi:10.1214/13-EJS821

Mathematical Reviews number (MathSciNet)
MR3070874

Zentralblatt MATH identifier
1327.62241

Subjects
Primary: 62G07: Density estimation

Keywords
Bandwidth selection; directional data; mixtures; kernel density estimator; von Mises

Citation

García–Portugués, Eduardo. Exact risk improvement of bandwidth selectors for kernel density estimation with directional data. Electron. J. Statist. 7 (2013), 1655–1685. doi:10.1214/13-EJS821. https://projecteuclid.org/euclid.ejs/1371649230


