Statistical Science

Kernel Smoothers: An Overview of Curve Estimators for the First Graduate Course in Nonparametric Statistics

William R. Schucany


Abstract

An introduction to nonparametric regression is accomplished with selected real data sets, statistical graphics and simulations from known functions. It is pedagogically effective for many to have some initial intuition about what the techniques are and why they work. Visual displays of small examples, along with plots of several types of smoothers, are a good beginning. Some students benefit from a brief historical development of the topic, provided that they are familiar with other methodology, such as linear regression. Ultimately, one must engage the formulas for some of the linear curve estimators. These mathematical expressions for local smoothers are more easily understood after the student has seen a graph and a description of what the procedure is actually doing. In this article there are several such figures. These are mostly scatterplots of a single response against one predictor. Kernel smoothers have series expansions for bias and variance. The leading terms of those expansions yield approximate expressions for asymptotic mean squared error, which in turn provide one criterion for selection of the bandwidth. This choice of a smoothing parameter is made in a rich variety of ways in practice. The final sections cover alternative approaches and extensions. The survey is supplemented with citations to some excellent books and articles. These provide the student with an entry into the literature, which is rapidly developing in traditional print media as well as online.
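
As a concrete illustration of the kind of estimator and bandwidth choice the abstract describes, the following is a minimal sketch, not taken from the article, of a Nadaraya-Watson kernel smoother with a Gaussian kernel, with the bandwidth selected by leave-one-out cross-validation over a grid (cross-validation being one of the many selection approaches the article surveys). The simulated sine-curve data, the grid of candidate bandwidths and all function names are illustrative assumptions.

    # Minimal sketch: Nadaraya-Watson smoother with Gaussian kernel and
    # leave-one-out cross-validation bandwidth selection (illustrative only).
    import numpy as np

    def gaussian_kernel(u):
        # Standard normal density used as the kernel weight function.
        return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

    def nw_smoother(x_grid, x, y, h):
        # m_hat(x0) = sum_i K((x0 - x_i)/h) y_i / sum_i K((x0 - x_i)/h)
        w = gaussian_kernel((x_grid[:, None] - x[None, :]) / h)
        return (w @ y) / w.sum(axis=1)

    def loo_cv_score(x, y, h):
        # Leave-one-out CV: predict each y_i from all other observations.
        w = gaussian_kernel((x[:, None] - x[None, :]) / h)
        np.fill_diagonal(w, 0.0)
        fitted = (w @ y) / w.sum(axis=1)
        return np.mean((y - fitted) ** 2)

    # Simulated example: noisy sine curve; bandwidth chosen by grid search.
    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(0, 2 * np.pi, 100))
    y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
    bandwidths = np.linspace(0.1, 1.5, 30)
    h_cv = bandwidths[np.argmin([loo_cv_score(x, y, h) for h in bandwidths])]
    m_hat = nw_smoother(x, x, y, h_cv)

The local polynomial smoothers emphasized in the article replace the weighted average above with a locally weighted least squares fit, but the role of the kernel weights and of the bandwidth h is the same.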

Article information

Source
Statist. Sci., Volume 19, Number 4 (2004), 663-675.

Dates
First available in Project Euclid: 18 April 2005

Permanent link to this document
https://projecteuclid.org/euclid.ss/1113832731

Digital Object Identifier
doi:10.1214/088342304000000756

Mathematical Reviews number (MathSciNet)
MR2185588

Zentralblatt MATH identifier
1127.62330

Keywords
Local polynomial regression, AIC, variable bandwidths, cross-validation, windows

Citation

Schucany, William R. Kernel Smoothers: An Overview of Curve Estimators for the First Graduate Course in Nonparametric Statistics. Statist. Sci. 19 (2004), no. 4, 663--675. doi:10.1214/088342304000000756. https://projecteuclid.org/euclid.ss/1113832731



References

  • Altman, N. S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. Amer. Statist. 46 175--185.
  • Bartlett, M. S. (1963). Statistical estimation of density functions. Sankhyā Ser. A 25 245--254.
  • Benedetti, J. K. (1977). On the nonparametric estimation of regression functions. J. Roy. Statist. Soc. Ser. B 39 248--253.
  • Chu, C.-K. and Marron, J. S. (1991). Choosing a kernel regression estimator (with discussion). Statist. Sci. 6 404--436.
  • Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. J. Amer. Statist. Assoc. 74 829--836.
  • Epanechnikov, V. A. (1969). Nonparametric estimation of a multivariate probability density. Theory Probab. Appl. 14 153--158.
  • Eubank, R. L. (1999). Nonparametric Regression and Spline Smoothing, 2nd ed. Dekker, New York.
  • Fan, J. (1992). Design-adaptive nonparametric regression. J. Amer. Statist. Assoc. 87 998--1004.
  • Fan, J. and Gijbels, I. (1995). Data-driven bandwidth selection in local polynomial fitting: Variable bandwidth and spatial adaptation. J. Roy. Statist. Soc. Ser. B 57 371--394.
  • Fan, J. and Gijbels, I. (1996). Local Polynomial Modeling and Its Applications. Chapman and Hall, London.
  • Gasser, T. and Müller, H.-G. (1979). Kernel estimation of regression functions. Smoothing Techniques for Curve Estimation. Lecture Notes in Math. 757 23--68. Springer, Heidelberg.
  • Gerard, P. D. and Schucany, W. R. (1997). Locating exotherms in differential thermal analysis with nonparametric regression. J. Agric. Biol. Environ. Stat. 2 255--268.
  • Hart, J. D. (1997). Nonparametric Smoothing and Lack-of-Fit Tests. Springer, New York.
  • Hart, J. D. and Lee, C.-L. (2005). Robustness of one-sided cross-validation to autocorrelation. J. Multivariate Anal. 92 77--96.
  • Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York.
  • Hodges, J. L., Jr. and Lehmann, E. L. (1956). The efficiency of some nonparametric competitors of the $t$-test. Ann. Math. Statist. 27 324--335.
  • Hurvich, C. M., Simonoff, J. S. and Tsai, C.-L. (1998). Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 271--293.
  • Hurvich, C. M. and Tsai, C.-L. (1989). Regression and time series model selection in small samples. Biometrika 76 297--307.
  • Jia, A. and Schucany, W. R. (2004). Recursive partitioning for kernel smoothers: A tree-based approach for estimating variable bandwidths in local linear regression. Unpublished manuscript.
  • Lin, X., Wang, N., Welsh, A. H. and Carroll, R. J. (2004). Equivalent kernels of smoothing splines in nonparametric regression for clustered/longitudinal data. Biometrika 91 177--193.
  • Loader, C. R. (1996). Change point estimation using nonparametric regression. Ann. Statist. 24 1667--1678.
  • Loader, C. R. (1999). Local Regression and Likelihood. Springer, New York.
  • Müller, H.-G. (1987). Weighted local regression and kernel methods for nonparametric curve fitting. J. Amer. Statist. Assoc. 82 231--238.
  • Müller, H.-G. (1992). Change-points in nonparametric regression analysis. Ann. Statist. 20 737--761.
  • Nadaraya, E. A. (1965). On nonparametric estimates of density functions and regression curves. Theory Probab. Appl. 10 186--190.
  • Parzen, E. (1962). On estimation of a probability density function and mode. Ann. Math. Statist. 33 1065--1076.
  • Pitblado, J. (2000). Estimating partially variable bandwidths in local linear regression using an information criterion. Ph.D. dissertation, Dept. Statistical Science, Southern Methodist Univ.
  • Priestley, M. B. and Chao, M. T. (1972). Nonparametric function fitting. J. Roy. Statist. Soc. Ser. B 34 385--392.
  • Ramsay, J. O. and Silverman, B. W. (1997). Functional Data Analysis. Springer, New York.
  • Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function. Ann. Math. Statist. 27 832--837.
  • Ruppert, D. (1997). Empirical-bias bandwidths for local polynomial nonparametric regression and density estimation. J. Amer. Statist. Assoc. 92 1049--1062.
  • Ruppert, D., Wand, M. P. and Carroll, R. J. (2003). Semiparametric Regression. Cambridge Univ. Press.
  • Schlee, W. (1988). Regressograms. Encyclopedia of Statistical Sciences 8 1--3. Wiley, New York.
  • Scott, D. W. (1992). Multivariate Density Estimation. Wiley, New York.
  • Sheather, S. J. (2004). Density estimation. Statist. Sci. 19 588--597.
  • Signorini, D. F. and Jones, M. C. (2004). Kernel estimators for univariate binary regression. J. Amer. Statist. Assoc. 99 119--126.
  • Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.
  • Spencer, J. (1904). On the graduation of rates of sickness and mortality. J. Institute of Actuaries 38 334--343.
  • Stone, C. J. (1977). Consistent nonparametric regression (with discussion). Ann. Statist. 5 595--645.
  • Tukey, J. W. (1961). Curves as parameters and touch estimation. Proc. Fourth Berkeley Symp. Math. Statist. Probab. 1 681--694. Univ. California Press, Berkeley.
  • Ullah, A. (1985). Specification analysis of econometric models. J. Quantitative Economics 1 187--209.
  • Wahba, G. and Wold, S. (1975). A completely automatic French curve. Comm. Statist. 4 1--17.
  • Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.
  • Watson, G. S. (1964). Smooth regression analysis. Sankhyā Ser. A 26 359--372.
  • Welsh, A. H., Lin, X. and Carroll, R. J. (2002). Marginal longitudinal nonparametric regression: Locality and efficiency of spline and kernel methods. J. Amer. Statist. Assoc. 97 482--493.