### Simultaneous variable selection and estimation in semiparametric modeling of longitudinal/clustered data

#### Abstract

We consider the problem of simultaneous variable selection and estimation in additive, partially linear models for longitudinal/clustered data. We propose an estimation procedure via polynomial splines to estimate the nonparametric components and apply proper penalty functions to achieve sparsity in the linear part. Under reasonable conditions, we obtain the asymptotic normality of the estimators for the linear components and the consistency of the estimators for the nonparametric components. We further demonstrate that, with proper choice of the regularization parameter, the penalized estimators of the non-zero coefficients achieve the asymptotic oracle property. The finite sample behavior of the penalized estimators is evaluated with simulation studies and illustrated by a longitudinal CD4 cell count data set.
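As a rough illustration of the idea summarized above, the following sketch fits a partially linear model with one nonparametric component: the smooth term is approximated by a polynomial spline, and a penalty on the linear coefficients shrinks some of them to exactly zero. It is a minimal sketch under stated assumptions, not the authors' procedure. The abstract does not name the penalty or the treatment of within-cluster correlation, so the L1 (lasso) penalty, the working-independence least-squares loss, the truncated power basis, the simulated data, and all function names (`spline_basis`, `fit_plm_lasso`) are assumptions made here for illustration. A plain L1 penalty is only a stand-in and is not claimed to deliver the oracle property discussed in the abstract.

```python
import numpy as np

def spline_basis(z, n_knots=5, degree=3):
    """Truncated power basis of a polynomial spline (no intercept column)."""
    # Interior knots at equally spaced sample quantiles (an illustrative choice).
    knots = np.quantile(z, np.linspace(0, 1, n_knots + 2)[1:-1])
    cols = [z ** d for d in range(1, degree + 1)]
    cols += [np.clip(z - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

def soft_threshold(v, t):
    """Soft-thresholding operator used by the L1 coordinate update."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def fit_plm_lasso(X, z, y, lam, n_iter=200):
    """Minimize (1/2n)||y - X beta - B gamma||^2 + lam * ||beta||_1.

    The spline coefficients gamma are unpenalized, so they are profiled out
    by projecting X and y onto the orthogonal complement of the spline basis.
    """
    n, p = X.shape
    B = np.column_stack([np.ones(n), spline_basis(z)])
    Q, _ = np.linalg.qr(B)
    Xr = X - Q @ (Q.T @ X)
    yr = y - Q @ (Q.T @ y)
    col_ss = (Xr ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    for _ in range(n_iter):  # cyclic coordinate descent
        for j in range(p):
            partial_resid = yr - Xr @ beta + Xr[:, j] * beta[j]
            rho = Xr[:, j] @ partial_resid / n
            beta[j] = soft_threshold(rho, lam) / col_ss[j]
    # Recover the spline coefficients given the selected linear part.
    gamma, *_ = np.linalg.lstsq(B, y - X @ beta, rcond=None)
    return beta, gamma, B

# Simulated clustered data (hypothetical): 100 clusters of size 5,
# with a shared random intercept inducing within-cluster correlation.
rng = np.random.default_rng(0)
n_clusters, m, p = 100, 5, 6
N = n_clusters * m
X = rng.normal(size=(N, p))
z = rng.uniform(size=N)
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.0])
cluster_effect = np.repeat(rng.normal(scale=0.5, size=n_clusters), m)
y = X @ beta_true + np.sin(2 * np.pi * z) + cluster_effect \
    + rng.normal(scale=0.5, size=N)

beta_hat, gamma_hat, B = fit_plm_lasso(X, z, y, lam=0.1)
eta_hat = B @ gamma_hat  # fitted smooth component (including intercept)
print(np.round(beta_hat, 3))
```

In this sketch the regularization parameter `lam` is fixed by hand; in practice it would be chosen by a data-driven criterion, and a working correlation structure could replace the independence assumption used here.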

#### Article information

Bernoulli, Volume 19, Number 1 (2013), 252-274.

First available in Project Euclid: 18 January 2013

https://projecteuclid.org/euclid.bj/1358531749

doi:10.3150/11-BEJ386

MR3019494

Zbl 1259.62021

Ma, Shujie; Song, Qiongxia; Wang, Li. Simultaneous variable selection and estimation in semiparametric modeling of longitudinal/clustered data. Bernoulli 19 (2013), no. 1, 252-274. doi:10.3150/11-BEJ386. https://projecteuclid.org/euclid.bj/1358531749

#### Supplemental materials

• Supplementary material: Supplement to “Simultaneous variable selection and estimation in semiparametric modeling of longitudinal/clustered data”. We provide detailed proofs of Lemmas A.2 to A.7 stated in the Appendix.