The Annals of Statistics

New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models

Bo Kai, Runze Li, and Hui Zou


Abstract

The complexity of semiparametric models poses new challenges for statistical inference and model selection, which arise frequently in real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To improve efficiency, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of the proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it loses only a small amount of efficiency for normal errors. In addition, we show that the loss in efficiency is at most 11.1% for estimating the varying-coefficient functions and no greater than 13.6% for estimating the parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.
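To make the idea of a composite quantile loss concrete, the following minimal sketch evaluates such an objective for a purely linear model. It is not the semiparametric local procedure of the paper: the function names, the equally spaced quantile levels tau_k = k/(q+1), and the single shared slope vector are assumptions made here for illustration only.

```python
import numpy as np

def check_loss(r, tau):
    """Quantile check loss rho_tau(r) = r * (tau - 1{r < 0})."""
    r = np.asarray(r, dtype=float)
    return r * (tau - (r < 0))

def cqr_objective(y, Z, beta, intercepts):
    """Composite quantile objective for a linear model (illustrative only).

    Assumes q = len(intercepts) quantile levels tau_k = k / (q + 1),
    one intercept per level, and a slope vector beta shared by all levels.
    """
    q = len(intercepts)
    taus = np.arange(1, q + 1) / (q + 1)
    resid = y - Z @ beta  # residuals before subtracting the level-specific intercepts
    return sum(check_loss(resid - b_k, tau).sum()
               for b_k, tau in zip(intercepts, taus))

# Hypothetical toy data: two observations, one covariate fixed at zero.
y = np.array([1.0, -1.0])
Z = np.zeros((2, 1))
value = cqr_objective(y, Z, beta=np.array([0.0]), intercepts=[0.0])
```

Sharing the slope across several quantile levels is what distinguishes the composite objective from fitting a single quantile regression; minimizing it over `beta` and `intercepts` would require an optimizer, which this sketch deliberately omits.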

Article information

Source
Ann. Statist. Volume 39, Number 1 (2011), 305-332.

Dates
First available in Project Euclid: 3 December 2010

Permanent link to this document
https://projecteuclid.org/euclid.aos/1291388377

Digital Object Identifier
doi:10.1214/10-AOS842

Mathematical Reviews number (MathSciNet)
MR2797848

Zentralblatt MATH identifier
1209.62074

Subjects
Primary: 62G05: Estimation; 62G08: Nonparametric regression
Secondary: 62G20: Asymptotic properties

Keywords
Asymptotic relative efficiency; composite quantile regression; semiparametric varying-coefficient partially linear model; oracle properties; variable selection

Citation

Kai, Bo; Li, Runze; Zou, Hui. New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann. Statist. 39 (2011), no. 1, 305–332. doi:10.1214/10-AOS842. https://projecteuclid.org/euclid.aos/1291388377.


