Bernoulli

Variable selection in measurement error models

Yanyuan Ma and Runze Li

Abstract

Measurement error data, or errors-in-variables data, have been collected in many studies. Natural criterion functions are often unavailable for general functional measurement error models due to the lack of information on the distribution of the unobservable covariates. Typically, parameter estimation proceeds by solving estimating equations. In addition, the construction of such estimating equations routinely requires solving integral equations, hence the computation is often much more intensive than for ordinary regression models. Because of these difficulties, traditional best subset variable selection procedures are not applicable, and in the measurement error model context, variable selection remains an unsolved issue. In this paper, we develop a framework for variable selection in measurement error models via penalized estimating equations. We first propose a class of selection procedures for general parametric measurement error models and for general semi-parametric measurement error models, and study the asymptotic properties of the proposed procedures. Then, under certain regularity conditions and with a properly chosen regularization parameter, we demonstrate that the proposed procedure performs as well as an oracle procedure. We assess the finite sample performance via Monte Carlo simulation studies and illustrate the proposed methodology through the empirical analysis of a familiar data set.
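
As a rough illustration of the penalty behind the "SCAD" keyword, the sketch below evaluates the derivative of the smoothly clipped absolute deviation penalty of Fan and Li (2001), the quantity that a penalized estimating equation would typically subtract from the unpenalized estimating function. The abstract does not state the exact form of the equations, so the helper `penalty_term`, its scaling by the sample size `n`, and the default `a = 3.7` (the value suggested by Fan and Li) are assumptions made for illustration, not the authors' implementation.

```python
def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty at |theta| (Fan and Li, 2001).

    Equals lam for |theta| <= lam, decays linearly to zero on
    (lam, a * lam], and is zero beyond a * lam.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if a <= 2:
        raise ValueError("a must exceed 2")
    t = abs(theta)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1.0)


def sign(x):
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return (x > 0) - (x < 0) + 0.0


def penalty_term(beta, lam, n, a=3.7):
    """Hypothetical penalty vector n * p'(|beta_j|) * sign(beta_j).

    Assumed form only: one common way to penalize an estimating
    equation is to subtract this vector from the summed estimating
    function and solve for the root.
    """
    return [n * scad_derivative(b, lam, a) * sign(b) for b in beta]


if __name__ == "__main__":
    # Small coefficients get the full penalty rate; large ones get none.
    print(penalty_term([0.5, -2.0, 5.0, 0.0], lam=1.0, n=10))
```

Because the derivative vanishes for coefficients larger than `a * lam` in absolute value, large effects are left essentially unpenalized while small ones are shrunk toward zero, which is the mechanism usually credited for oracle-type behavior.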

Article information

Source
Bernoulli, Volume 16, Number 1 (2010), 274-300.

Dates
First available in Project Euclid: 12 February 2010

Permanent link to this document
https://projecteuclid.org/euclid.bj/1265984712

Digital Object Identifier
doi:10.3150/09-BEJ205

Mathematical Reviews number (MathSciNet)
MR2648758

Zentralblatt MATH identifier
1200.62071

Keywords
errors in variables; estimating equations; measurement error models; non-concave penalty function; SCAD; semi-parametric methods

Citation

Ma, Yanyuan; Li, Runze. Variable selection in measurement error models. Bernoulli 16 (2010), no. 1, 274–300. doi:10.3150/09-BEJ205. https://projecteuclid.org/euclid.bj/1265984712


References

  • Bickel, P.J. and Ritov, A.J.C. (1987). Efficient estimation in the errors-in-variables model. Ann. Statist. 15 513–540.
  • Cai, J., Fan, J., Li, R. and Zhou, H. (2005). Variable selection for multivariate failure time data. Biometrika 92 303–316.
  • Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n (with discussion). Ann. Statist. 35 2313–2392.
  • Carroll, R.J. and Hall, P. (1988). Optimal rates of convergence for deconvolving a density. J. Amer. Statist. Assoc. 83 1184–1186.
  • Carroll, R.J., Ruppert, D., Stefanski, L.A. and Crainiceanu, C. (2006). Measurement Error in Nonlinear Models: A Modern Perspective, 2nd ed. London: CRC Press.
  • Delaigle, A. and Hall, P. (2007). Using SIMEX for smoothing-parameter choice in errors-in-variables problems. J. Amer. Statist. Assoc. 103 280–287.
  • Delaigle, A. and Meister, A. (2007). Nonparametric regression estimation in the heteroscedastic errors-in-variables problem. J. Amer. Statist. Assoc. 102 1416–1426.
  • Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19 1257–1272.
  • Fan, J. and Huang, T. (2005). Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11 1031–1057.
  • Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
  • Fan, J. and Lv, J. (2008). Sure independence screening for ultra-high dimensional feature space (with discussion). J. Roy. Statist. Soc. Ser. B 70 849–911.
  • Fan, J. and Peng, H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. Ann. Statist. 32 928–961.
  • Hall, P. and Ma, Y. (2007). Semiparametric estimators of functional measurement error models with unknown error. J. Roy. Statist. Soc. Ser. B 69 429–446.
  • Härdle, W., Liang, H. and Gao, J. (2000). Partially Linear Models. Heidelberg: Springer Physica.
  • Hunter, D. and Li, R. (2005). Variable selection using MM algorithms. Ann. Statist. 33 1617–1642.
  • Kannel, W.B., Newton, J.D., Wentworth, D., Thomas, H.E., Stamler, J., Hulley, S.B. and Kjelsberg, M.O. (1986). Overall and coronary heart disease mortality rates in relation to major risk factors in 325,348 men screened for MRFIT. Am. Heart J. 112 825–836.
  • Lam, C. and Fan, J. (2008). Profile-Kernel likelihood inference with diverging number of parameters. Ann. Statist. 36 2232–2260.
  • Li, R. and Liang, H. (2008). Variable selection in semiparametric regression modeling. Ann. Statist. 36 261–286.
  • Li, R. and Nie, L. (2007). A new estimation procedure for partially nonlinear model via a mixed effects approach. Canad. J. Statist. 35 399–411.
  • Li, R. and Nie, L. (2008). Efficient statistical inference procedures for partially nonlinear models and their applications. Biometrics 64 904–911.
  • Liang, H., Härdle, W. and Carroll, R.J. (1999). Estimation in a semiparametric partially linear errors-in-variables model. Ann. Statist. 27 1519–1535.
  • Liang, H. and Li, R. (2009). Variable selection for partially linear models with measurement errors. J. Amer. Statist. Assoc. 104 234–248.
  • Ma, Y. and Carroll, R.J. (2006). Locally efficient estimators for semiparametric models with measurement error. J. Amer. Statist. Assoc. 101 1465–1474.
  • Ma, Y. and Li, R. (2007). Variable selection in measurement error models. Technical report. Available at http://www2.unine.ch/webdav/site/statistics/shared/documents/v10.pdf.
  • Ma, Y. and Tsiatis, A.A. (2006). Closed form semiparametric estimators for measurement error models. Statist. Sinica 16 183–193.
  • Severini, T.A. and Staniswalis, J.G. (1994). Quasilikelihood estimation in semiparametric models. J. Amer. Statist. Assoc. 89 501–511.
  • Stefanski, L.A. and Carroll, R.J. (1987). Conditional scores and optimal scores for generalized linear measurement-error models. Biometrika 74 703–716.
  • Tsiatis, A.A. and Ma, Y. (2004). Locally efficient semiparametric estimators for functional measurement error models. Biometrika 91 835–848.
  • Wang, H., Li, R. and Tsai, C. (2007). Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika 94 553–568.
  • Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models (with discussion). Ann. Statist. 36 1509–1566.