The Annals of Statistics

Testing the suitability of polynomial models in errors-in-variables problems

Peter Hall and Yanyuan Ma

Abstract

Low-degree polynomial models for a response curve are commonly used in practice; they generally take the form of a linear or quadratic function of the covariate. In this paper we suggest methods for testing the goodness of fit of a general polynomial model when there are errors in the covariates. In that setting the true covariates are not directly observed, and conventional bootstrap methods for testing are not applicable. We develop a new approach, in which deconvolution methods are used to estimate the distribution of the covariates under the null hypothesis, and a “wild” or moment-matching bootstrap argument is employed to estimate the distribution of the experimental errors (which is distinct from the distribution of the errors in covariates). Most of our attention is directed at the case where the distribution of the errors in covariates is known, although we also discuss methods for estimation and testing when the covariate error distribution is estimated. No assumptions are made about the distribution of experimental error and, in particular, we depart substantially from conventional parametric models for errors-in-variables problems.
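The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure, only textbook versions of each building block under assumptions that the abstract does not make: the covariate error is taken to be Laplace with known scale `b`, the kernel is Gaussian, the bandwidth `h` is fixed by hand, and the bootstrap multipliers follow Mammen's two-point distribution, which has mean 0, variance 1 and third moment 1. The function names `deconvolution_kde` and `wild_bootstrap_errors` and all numerical values are hypothetical, chosen for illustration.

```python
import numpy as np


def deconvolution_kde(x_grid, w, h, b):
    """Deconvolution kernel density estimate of the unobserved covariate X.

    Assumes W = X + U with U ~ Laplace(0, b), b known, and a Gaussian kernel.
    Because 1 / phi_U(t) = 1 + b^2 t^2 for Laplace errors, the deconvolution
    kernel has the closed form L(u) = phi(u) * (1 + (b / h)^2 * (1 - u^2)).
    """
    u = (x_grid[:, None] - w[None, :]) / h
    gauss = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    kernel = gauss * (1.0 + (b / h) ** 2 * (1.0 - u**2))
    return kernel.mean(axis=1) / h


# Mammen's two-point distribution (an assumed choice of multiplier):
# it matches mean 0, variance 1 and third moment 1.
SQRT5 = np.sqrt(5.0)
MAMMEN_VALUES = np.array([-(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / 2.0])
MAMMEN_PROBS = np.array([(SQRT5 + 1.0) / (2.0 * SQRT5),
                         (SQRT5 - 1.0) / (2.0 * SQRT5)])


def wild_bootstrap_errors(residuals, rng):
    """Resample errors by multiplying each residual by an independent multiplier."""
    v = rng.choice(MAMMEN_VALUES, size=residuals.shape, p=MAMMEN_PROBS)
    return residuals * v


# Simulated data (hypothetical): X is never used by the estimator, only W.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
w = x + rng.laplace(scale=0.3, size=n)

grid = np.linspace(-15.0, 15.0, 3001)
f_hat = deconvolution_kde(grid, w, h=0.4, b=0.3)

# Placeholder residuals; in an errors-in-variables problem these are not
# directly available, which is why the paper needs a different argument.
residuals = rng.normal(scale=0.5, size=n)
e_star = wild_bootstrap_errors(residuals, rng)
```

The deconvolution estimate can take small negative values in the tails, which is a known feature of such kernels rather than a bug. The multiplier step preserves each residual's magnitude up to a random factor, so no parametric form for the experimental error is imposed.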

Article information

Source
Ann. Statist., Volume 35, Number 6 (2007), 2620-2638.

Dates
First available in Project Euclid: 22 January 2008

Permanent link to this document
https://projecteuclid.org/euclid.aos/1201012974

Digital Object Identifier
doi:10.1214/009053607000000361

Mathematical Reviews number (MathSciNet)
MR2382660

Zentralblatt MATH identifier
1129.62042

Subjects
Primary: 62G08: Nonparametric regression; 62G09: Resampling methods; 62G10: Hypothesis testing; 62G20: Asymptotic properties

Keywords
Bandwidth; bootstrap; deconvolution; distribution estimation; hypothesis testing; ill-posed problem; kernel methods; measurement error; moment-matching bootstrap; smoothing; regularization; wild bootstrap

Citation

Hall, Peter; Ma, Yanyuan. Testing the suitability of polynomial models in errors-in-variables problems. Ann. Statist. 35 (2007), no. 6, 2620–2638. doi:10.1214/009053607000000361. https://projecteuclid.org/euclid.aos/1201012974

