Bernoulli

Efficient semiparametric estimation in generalized partially linear additive models for longitudinal/clustered data

Guang Cheng, Lan Zhou, and Jianhua Z. Huang

Full-text: Open access

Abstract

We consider efficient estimation of the Euclidean parameters in generalized partially linear additive models for longitudinal/clustered data when multiple covariates need to be modeled nonparametrically, and propose an estimation procedure based on a spline approximation of the nonparametric part of the model and the generalized estimating equations (GEE). Although the model under consideration is natural and useful in many practical applications, the literature on it is very limited because of the challenges in dealing with dependent data for nonparametric additive models. We show that the proposed estimators are consistent and asymptotically normal even if the covariance structure is misspecified. An explicit consistent estimate of the asymptotic variance is also provided. Moreover, we derive the semiparametric efficient score and information bound under general moment conditions. By showing that our estimators achieve the semiparametric information bound, we effectively establish their efficiency in a stronger sense than what is typically considered for GEE. The derivation of our asymptotic results relies heavily on empirical process tools that we develop for longitudinal/clustered data. Numerical results illustrate the finite-sample performance of the proposed estimators.
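
To make the idea in the abstract concrete, the following is a minimal sketch of a spline-based GEE fit. It is not the authors' implementation. It assumes an identity link, equal cluster sizes, a working exchangeable correlation with a fixed, user-supplied rho, and a truncated power basis in place of B-splines. All function names and simulation settings are hypothetical.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Centered polynomial spline basis: x, ..., x^degree, (x - knot)_+^degree."""
    cols = [x ** d for d in range(1, degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    B = np.column_stack(cols)
    return B - B.mean(axis=0)  # centering keeps each additive term identifiable

def fit_spline_gee(y, X, T, cluster_size, rho=0.0, n_knots=3):
    """Identity-link GEE for y = a + X beta + sum_k f_k(T_k) + error.

    Assumes rows are ordered by cluster and all clusters have the same size.
    Returns beta_hat and a sandwich covariance estimate for beta_hat.
    """
    n, p = X.shape
    knots = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]  # interior knots on [0, 1]
    bases = [truncated_power_basis(T[:, k], knots) for k in range(T.shape[1])]
    D = np.column_stack([np.ones(n), X] + bases)
    # Inverse of the working exchangeable correlation matrix (assumed, fixed rho).
    R = (1.0 - rho) * np.eye(cluster_size) + rho * np.ones((cluster_size, cluster_size))
    W = np.linalg.inv(R)
    Dc = D.reshape(n // cluster_size, cluster_size, -1)
    yc = y.reshape(n // cluster_size, cluster_size)
    # The estimating equation sum_i D_i' W (y_i - D_i theta) = 0 is linear in theta.
    A = np.einsum("imj,mn,ink->jk", Dc, W, Dc)
    b = np.einsum("imj,mn,in->j", Dc, W, yc)
    theta = np.linalg.solve(A, b)
    # Sandwich variance: stays valid when the working correlation is misspecified.
    resid = yc - Dc @ theta
    scores = np.einsum("imj,mn,in->ij", Dc, W, resid)
    A_inv = np.linalg.inv(A)
    cov = A_inv @ (scores.T @ scores) @ A_inv
    return theta[1:p + 1], cov[1:p + 1, 1:p + 1]

# Hypothetical simulation: 300 clusters of size 4 with a shared random intercept.
rng = np.random.default_rng(0)
n_clusters, m = 300, 4
n = n_clusters * m
X = rng.normal(size=(n, 2))
T = rng.uniform(size=(n, 2))
beta_true = np.array([1.0, -0.5])
f = np.sin(2 * np.pi * T[:, 0]) + (T[:, 1] - 0.5) ** 2
cluster_effect = np.repeat(rng.normal(scale=0.5, size=n_clusters), m)
y = X @ beta_true + f + cluster_effect + rng.normal(scale=0.5, size=n)

beta_hat, cov = fit_spline_gee(y, X, T, cluster_size=m, rho=0.5)
print("beta_hat:", beta_hat)
print("sandwich standard errors:", np.sqrt(np.diag(cov)))
```

With an identity link the estimating equation has a closed-form solution. A non-identity link would instead need an iterative solver such as Fisher scoring, and in practice rho would be estimated rather than fixed.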

Article information

Source
Bernoulli Volume 20, Number 1 (2014), 141-163.

Dates
First available in Project Euclid: 22 January 2014

Permanent link to this document
https://projecteuclid.org/euclid.bj/1390407283

Digital Object Identifier
doi:10.3150/12-BEJ479

Mathematical Reviews number (MathSciNet)
MR3160576

Zentralblatt MATH identifier
06282545

Keywords
GEE; link function; longitudinal data; partially linear additive models; polynomial splines

Citation

Cheng, Guang; Zhou, Lan; Huang, Jianhua Z. Efficient semiparametric estimation in generalized partially linear additive models for longitudinal/clustered data. Bernoulli 20 (2014), no. 1, 141–163. doi:10.3150/12-BEJ479. https://projecteuclid.org/euclid.bj/1390407283.


References

  • [1] Bickel, P.J., Klaassen, C.A.J., Ritov, Y. and Wellner, J.A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Series in the Mathematical Sciences. Baltimore, MD: Johns Hopkins Univ. Press.
  • [2] Carroll, R.J., Maity, A., Mammen, E. and Yu, K. (2009). Efficient semiparametric marginal estimation for partially linear additive model for longitudinal/clustered data. Statistics in BioSciences 1 10–31.
  • [3] Chen, H. (1988). Convergence rates for parametric components in a partly linear model. Ann. Statist. 16 136–146.
  • [4] Chen, K. and Jin, Z. (2006). Partial linear regression models for clustered data. J. Amer. Statist. Assoc. 101 195–204.
  • [5] Cheng, G., Zhou, L. and Huang, J.Z. (2014). Supplement to “Efficient semiparametric estimation in generalized partially linear additive models for longitudinal/clustered data.” DOI:10.3150/12-BEJ479SUPP.
  • [6] de Boor, C. (2001). A Practical Guide to Splines, revised ed. Applied Mathematical Sciences 27. New York: Springer.
  • [7] Diggle, P.J., Heagerty, P.J., Liang, K.Y. and Zeger, S.L. (2002). Analysis of Longitudinal Data, 2nd ed. Oxford Statistical Science Series 25. Oxford: Oxford Univ. Press.
  • [8] Härdle, W., Liang, H. and Gao, J. (2000). Partially Linear Models. New York: Springer.
  • [9] He, X., Fung, W.K. and Zhu, Z. (2005). Robust estimation in generalized partial linear models for clustered data. J. Amer. Statist. Assoc. 100 1176–1184.
  • [10] He, X., Zhu, Z.Y. and Fung, W.K. (2002). Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 89 579–590.
  • [11] Huang, J.Z., Wu, C.O. and Zhou, L. (2002). Varying-coefficient models and basis function approximations for the analysis of repeated measurements. Biometrika 89 111–128.
  • [12] Huang, J.Z., Zhang, L. and Zhou, L. (2007). Efficient estimation in marginal partially linear models for longitudinal/clustered data using splines. Scand. J. Stat. 34 451–477.
  • [13] Kress, R. (1999). Linear Integral Equations, 2nd ed. Applied Mathematical Sciences 82. New York: Springer.
  • [14] Leng, C., Zhang, W. and Pan, J. (2010). Semiparametric mean-covariance regression analysis for longitudinal data. J. Amer. Statist. Assoc. 105 181–193. With supplementary material available online.
  • [15] Liang, K.Y. and Zeger, S.L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73 13–22.
  • [16] Lin, X. and Carroll, R.J. (2001). Semiparametric regression for clustered data. Biometrika 88 1179–1185.
  • [17] Lin, X. and Carroll, R.J. (2001). Semiparametric regression for clustered data using generalized estimating equations. J. Amer. Statist. Assoc. 96 1045–1056.
  • [18] Sasieni, P. (1992). Nonorthogonal projections and their application to calculating the information in a partly linear Cox model. Scand. J. Stat. 19 215–233.
  • [19] Schumaker, L.L. (1981). Spline Functions: Basic Theory. New York: Wiley.
  • [20] Severini, T.A. and Staniswalis, J.G. (1994). Quasi-likelihood estimation in semiparametric models. J. Amer. Statist. Assoc. 89 501–511.
  • [21] Speckman, P. (1988). Kernel smoothing in partial linear models. J. R. Stat. Soc. Ser. B Stat. Methodol. 50 413–436.
  • [22] Stone, C.J. (1994). The use of polynomial splines and their tensor products in multivariate function estimation. Ann. Statist. 22 118–171.
  • [23] van de Geer, S. (2000). Empirical Processes in M-Estimation. Cambridge: Cambridge Univ. Press.
  • [24] Wang, N. (2003). Marginal nonparametric kernel regression accounting for within-subject correlation. Biometrika 90 43–52.
  • [25] Wang, N., Carroll, R.J. and Lin, X. (2005). Efficient semiparametric marginal estimation for longitudinal/clustered data. J. Amer. Statist. Assoc. 100 147–157.
  • [26] Zeger, S.L. and Diggle, P.J. (1994). Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters. Biometrics 50 689–699.
  • [27] Zhang, L. (2004). Efficient estimation in marginal partially linear models for longitudinal/clustered data using splines. Ph.D. thesis, Univ. Pennsylvania.

Supplemental materials

  • Supplementary material: Supplement to “Efficient semiparametric estimation in generalized partially linear additive models for longitudinal/clustered data”. The supplementary file (Cheng, Zhou and Huang [5]) includes the properties of the least favorable directions and the complete proofs of Theorems 1 and 2, together with some empirical process results for clustered/longitudinal data. It also includes the results of a simulation study that compares our method with that of Carroll et al. [2].