The Annals of Statistics

Data sharpening methods for bias reduction in nonparametric regression

Edwin Choi, Peter Hall, and Valentin Rousson

Full-text: Open access


We consider methods for kernel regression when the explanatory and/or response variables are adjusted prior to substitution into a conventional estimator. This “data-sharpening” procedure is designed to preserve the advantages of relatively simple, low-order techniques, for example, their robustness against design sparsity problems, yet attain the sorts of bias reductions that are commonly associated only with high-order methods. We consider Nadaraya–Watson and local-linear methods in detail, although data sharpening is applicable more widely. One approach in particular is found to give excellent performance. It involves adjusting both the explanatory and the response variables prior to substitution into a local-linear estimator. The change to the explanatory variables enhances resistance of the estimator to design sparsity, by increasing the density of design points in places where the original density had been low. When combined with adjustment of the response variables, it produces a reduction in bias by an order of magnitude. Moreover, these advantages are available in multivariate settings. The data-sharpening step is simple to implement, since it is explicitly defined. It does not involve functional inversion, solution of equations or use of pilot bandwidths.
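
As a rough illustration of the workflow the abstract describes (adjust the data first, then substitute it into a conventional estimator), the sketch below implements the two standard estimators named above, Nadaraya–Watson and local-linear, in plain Python. The abstract does not state the sharpening formulas, so the `sharpen` argument is a hypothetical placeholder for whatever adjustment of the explanatory and response variables is chosen. The Gaussian kernel, the function names and the toy data are likewise assumptions made for this example, not the authors' implementation.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, h):
    """Local-constant fit at x0: kernel-weighted average of the responses."""
    w = [gaussian_kernel((x - x0) / h) for x in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)


def local_linear(x0, xs, ys, h):
    """Local-linear fit at x0: intercept of a kernel-weighted least-squares line."""
    w = [gaussian_kernel((x - x0) / h) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    # Solve the 2x2 weighted normal equations for the intercept.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


def sharpened_fit(x0, xs, ys, h, sharpen, estimator=local_linear):
    """Apply a user-supplied sharpening map to the data, then a conventional estimator.

    `sharpen` is hypothetical: it takes (xs, ys) and returns adjusted (xs, ys).
    The paper's actual sharpening rules are not reproduced here.
    """
    xs_new, ys_new = sharpen(xs, ys)
    return estimator(x0, xs_new, ys_new, h)


# Toy usage with an identity "sharpening" map, which leaves the data unchanged.
xs = [i / 10 for i in range(11)]
ys = [2.0 * x + 1.0 for x in xs]
identity = lambda a, b: (a, b)
print(nadaraya_watson(0.5, xs, ys, 0.2))
print(sharpened_fit(0.5, xs, ys, 0.2, identity))
```

Keeping the sharpening step as a separate function mirrors the abstract's point that the adjustment happens before a simple, low-order estimator is applied, so the estimator itself is left unchanged.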

Article information

Ann. Statist., Volume 28, Number 5 (2000), 1339-1355.

First available in Project Euclid: 12 March 2002

Primary: 62G07: Density estimation
Secondary: 62H05: Characterization and structure theory

Keywords: Bandwidth; curse of dimensionality; design sparsity; explanatory variables; kernel methods; local-linear estimator; local-polynomial methods; Nadaraya-Watson estimator; response variables; smoothing


Choi, Edwin; Hall, Peter; Rousson, Valentin. Data sharpening methods for bias reduction in nonparametric regression. Ann. Statist. 28 (2000), no. 5, 1339-1355. doi:10.1214/aos/1015957396.

