Electronic Journal of Statistics

Local bandwidth selection via second derivative segmentation

Alexander Aue, Thomas C. M. Lee, and Haonan Wang

Full-text: Open access

Abstract

This paper studies the problem of local bandwidth selection for local linear regression. It is known that the optimal local bandwidth for estimating the unknown curve f at a design point x depends on the curve’s second derivative f''(x) at x. One could therefore select the local bandwidth h(x) by first estimating f''(x). However, because estimating f''(x) is typically much harder than estimating f(x) itself, this approach to choosing h(x) tends to produce less accurate results. This paper proposes a method for choosing h(x) that bypasses the estimation of f''(x) while still exploiting the fact that the optimal local bandwidth depends on f''(x). The main idea is to first partition the domain of f into segments on each of which the second derivative is approximately constant. The number and lengths of the segments are assumed unknown and are estimated from the data. Once such a partition is obtained, any reliable, well-studied global bandwidth selection method can be applied to choose the bandwidth for each segment. The empirical performance of the proposed local bandwidth selection method is evaluated by numerical experiments.
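The two-stage idea in the abstract (segment first, then pick one bandwidth per segment) can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes a single break point, found by comparing least-squares piecewise quadratic fits, and it uses leave-one-out cross-validation with a Gaussian kernel as a stand-in for "any global bandwidth selection method". The function names, the test curve, the noise level, and the bandwidth grid are all hypothetical choices made for illustration.

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear estimate of f(x0) using a Gaussian kernel with bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    X = np.column_stack([np.ones_like(x), x - x0])
    XtW = (X * w[:, None]).T
    beta = np.linalg.lstsq(XtW @ X, XtW @ y, rcond=None)[0]
    return beta[0]  # the intercept is the fitted value at x0

def loo_cv(x, y, h):
    """Leave-one-out cross-validation error of the local linear fit."""
    keep = np.ones(len(x), dtype=bool)
    sq_err = []
    for i in range(len(x)):
        keep[i] = False
        sq_err.append((y[i] - local_linear(x[i], x[keep], y[keep], h)) ** 2)
        keep[i] = True
    return float(np.mean(sq_err))

def select_bandwidth(x, y, grid):
    """Global bandwidth on one segment: the grid value with the smallest CV error."""
    return min(grid, key=lambda h: loo_cv(x, y, h))

def quadratic_rss(x, y):
    """Residual sum of squares of a quadratic fit (constant second derivative)."""
    coef = np.polyfit(x, y, 2)
    return float(np.sum((y - np.polyval(coef, x)) ** 2))

def best_single_break(x, y, min_size=15):
    """Split index minimising the total RSS of two separate quadratic fits.

    Assumption for this sketch: exactly one break point. The paper treats the
    number of segments as unknown, which this simplified search does not do.
    """
    candidates = range(min_size, len(x) - min_size)
    return min(
        candidates,
        key=lambda k: quadratic_rss(x[:k], y[:k]) + quadratic_rss(x[k:], y[k:]),
    )

# Hypothetical test curve: f'' = 200 on [0, 0.6) and f'' = 0 on [0.6, 1].
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
f = np.where(x < 0.6, 100.0 * (x - 0.3) ** 2, 9.0 + 60.0 * (x - 0.6))
y = f + 0.1 * rng.standard_normal(x.size)

grid = (0.01, 0.02, 0.05, 0.1, 0.2, 0.4)
k = best_single_break(x, y)                      # stage 1: segmentation
h_left = select_bandwidth(x[:k], y[:k], grid)    # stage 2: one bandwidth per segment
h_right = select_bandwidth(x[k:], y[k:], grid)
print("estimated break at x =", x[k])
print("bandwidths (left, right):", h_left, h_right)
```

On the strongly curved left segment one would expect a small bandwidth to be chosen, while on the linear right segment a local linear fit has no curvature-induced bias, so larger bandwidths tend to be favoured. No estimate of f''(x) itself is ever computed.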

Article information

Source
Electron. J. Statist., Volume 6 (2012), 478-500.

Dates
First available in Project Euclid: 30 March 2012

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1333113099

Digital Object Identifier
doi:10.1214/12-EJS682

Mathematical Reviews number (MathSciNet)
MR2988416

Zentralblatt MATH identifier
1274.62274

Subjects
Primary: 62G08: Nonparametric regression

Keywords
Bandwidth function; break point detection; local linear regression; optimal bandwidth

Citation

Aue, Alexander; Lee, Thomas C. M.; Wang, Haonan. Local bandwidth selection via second derivative segmentation. Electron. J. Statist. 6 (2012), 478--500. doi:10.1214/12-EJS682. https://projecteuclid.org/euclid.ejs/1333113099


