The Annals of Statistics

Analysis of variance, coefficient of determination and F-test for local polynomial regression

Li-Shan Huang and Jianwei Chen



This paper provides ANOVA inference for nonparametric local polynomial regression (LPR) in analogy with ANOVA tools for the classical linear regression model. A surprisingly simple and exact local ANOVA decomposition is established, and a local R-squared quantity is defined to measure the proportion of local variation explained by fitting LPR. A global ANOVA decomposition is obtained by integrating local counterparts, and a global R-squared and a symmetric projection matrix are defined. We show that the proposed projection matrix is asymptotically idempotent and asymptotically orthogonal to its complement, naturally leading to an F-test for testing for no effect. A by-product result is that the asymptotic bias of the “projected” response based on local linear regression is of quartic order of the bandwidth. Numerical results illustrate the behaviors of the proposed R-squared and F-test. The ANOVA methodology is also extended to varying coefficient models.
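The abstract does not state the formulas, so the following is only a minimal sketch of why an exact local decomposition is plausible. It assumes the local sums of squares are kernel-weighted sums around a point x0, with the total taken about the global mean of the response. Under that assumption the identity SST(x0) = SSE(x0) + SSR(x0) follows from the usual orthogonality of weighted least squares residuals to a design containing an intercept. The function name `local_linear_anova`, the Gaussian kernel, the bandwidth value and the simulated data are all illustrative choices and are not taken from the paper.

```python
import math
import random


def local_linear_anova(xs, ys, x0, h):
    """Kernel-weighted sums of squares for a local linear fit centred at x0.

    Assumed (hypothetical) definitions: weights are Gaussian kernel values,
    and the total sum of squares is taken about the global mean of ys.
    """
    # Gaussian kernel weights K((X_i - x0) / h) / h.
    w = [math.exp(-0.5 * ((x - x0) / h) ** 2) / (h * math.sqrt(2 * math.pi))
         for x in xs]
    d = [x - x0 for x in xs]
    ybar = sum(ys) / len(ys)

    # Weighted normal equations for the local line b0 + b1 * (x - x0).
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    t0 = sum(wi * yi for wi, yi in zip(w, ys))
    t1 = sum(wi * di * yi for wi, di, yi in zip(w, d, ys))
    det = s0 * s2 - s1 * s1
    b0 = (s2 * t0 - s1 * t1) / det
    b1 = (s0 * t1 - s1 * t0) / det

    fit = [b0 + b1 * di for di in d]
    sst = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, ys))
    sse = sum(wi * (yi - fi) ** 2 for wi, yi, fi in zip(w, ys, fit))
    ssr = sum(wi * (fi - ybar) ** 2 for wi, fi in zip(w, fit))
    return sst, sse, ssr


# Simulated data, for illustration only.
random.seed(0)
xs = [random.random() for _ in range(200)]
ys = [math.sin(2 * math.pi * x) + random.gauss(0, 0.3) for x in xs]

sst, sse, ssr = local_linear_anova(xs, ys, x0=0.5, h=0.1)
local_r2 = ssr / sst  # proportion of local variation explained at x0
print(sst, sse + ssr, local_r2)
```

In this sketch the two printed sums agree up to floating-point error, and `local_r2` lies between 0 and 1. A global version would integrate such local quantities over x0, as the abstract describes; the exact weighting used in the paper may differ from the one assumed here.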

Article information

Ann. Statist., Volume 36, Number 5 (2008), 2085-2109.

First available in Project Euclid: 13 October 2008


Primary: 62G08: Nonparametric regression
Secondary: 62J10: Analysis of variance and covariance

Keywords: Bandwidth; nonparametric regression; projection matrix; R-squared; smoothing splines; varying coefficient models; model checking


Huang, Li-Shan; Chen, Jianwei. Analysis of variance, coefficient of determination and F-test for local polynomial regression. Ann. Statist. 36 (2008), no. 5, 2085–2109. doi:10.1214/07-AOS531.


