Abstract
Scatterplot smoothers estimate a regression function y = f(x) by local averaging of the observed data points (x_i, y_i). In using a smoother, the statistician must choose a “window width,” a crucial smoothing parameter that says just how locally the averaging is done. This paper concerns the data-based choice of a smoothing parameter for splinelike smoothers, focusing on the comparison of two popular methods, C_p and generalized maximum likelihood. The latter is the MLE within a normal-theory empirical Bayes model. We show that C_p is also maximum likelihood within a closely related nonnormal family, both methods being examples of a class of selection criteria. Each member of the class is the MLE within its own one-parameter curved exponential family. Exponential family theory facilitates a finite-sample nonasymptotic comparison of the criteria. In particular it explains the eccentric behavior of C_p, which even in favorable circumstances can easily select small window widths and wiggly estimates of f(x). The theory leads to simple geometric pictures of both C_p and MLE that are valid whether or not one believes in the probability models.
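As a rough illustration of what a data-based choice of window width looks like, the sketch below scores a few candidate widths with a C_p-type criterion: residual sum of squares plus 2 * sigma^2 * trace(S), where S is the matrix of a linear smoother. It is not taken from the paper. It assumes a Gaussian kernel smoother in place of the splinelike smoothers studied there, treats the noise variance as known, and uses simulated data; the function names (kernel_smoother_matrix, cp_score, select_width) and the grid of widths are hypothetical.

```python
import numpy as np


def kernel_smoother_matrix(x, width):
    """Row-normalized Gaussian kernel weights; fitted values are S @ y.

    Assumption: a kernel smoother stands in for a splinelike smoother.
    """
    diffs = (x[:, None] - x[None, :]) / width
    weights = np.exp(-0.5 * diffs ** 2)
    return weights / weights.sum(axis=1, keepdims=True)


def cp_score(y, S, sigma2):
    """C_p-type criterion: RSS + 2 * sigma^2 * trace(S).

    Assumption: the noise variance sigma2 is known.
    """
    resid = y - S @ y
    return float(resid @ resid + 2.0 * sigma2 * np.trace(S))


def select_width(x, y, widths, sigma2):
    """Return the candidate width with the smallest score, plus all scores."""
    scores = [cp_score(y, kernel_smoother_matrix(x, w), sigma2) for w in widths]
    return widths[int(np.argmin(scores))], scores


# Simulated data (illustrative only): a sine curve plus Gaussian noise.
rng = np.random.default_rng(0)
n = 100
x = np.linspace(0.0, 1.0, n)
sigma = 0.3
y = np.sin(2.0 * np.pi * x) + sigma * rng.normal(size=n)

widths = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5]
best_width, scores = select_width(x, y, widths, sigma ** 2)
print("selected window width:", best_width)
```

The trace of S plays the role of the smoother's effective degrees of freedom, so small widths are penalized for fitting the noise. Rerunning the sketch with different random seeds gives an informal sense of how variable the selected width can be.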
Citation
Bradley Efron. "Selection criteria for scatterplot smoothers." Ann. Statist. 29 (2) 470–505, April 2001. https://doi.org/10.1214/aos/1009210549
Information