Electronic Journal of Statistics

Fixed and random effects selection in nonparametric additive mixed models

Randy C. S. Lai, Hsin-Cheng Huang, and Thomas C. M. Lee

Full-text: Open access


This paper considers the problem of model selection in a nonparametric additive mixed modeling framework. The fixed effects are modeled nonparametrically using truncated series expansions with B-spline bases. Estimation and selection of these nonparametric fixed effects are achieved simultaneously by the adaptive group lasso methodology, while the random effects are selected by a traditional backward selection mechanism. To facilitate automatic selection of the model dimension, computable expressions for the degrees of freedom of both the fixed and random effects components are derived, and the Bayesian information criterion (BIC) is used to select the final model. Theoretically, it is shown that this BIC model selection method is consistent; computationally, a practical algorithm is developed for solving the optimization problem involved. Simulation results show that the proposed methodology is often capable of selecting the correct significant fixed and random effects components, especially when the sample size and/or signal-to-noise ratio are not too small. The new method is also applied to two real data sets.
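As a rough illustration of the two ingredients named in the abstract, the following minimal sketch shows the groupwise shrinkage step that underlies group lasso methods and a generic Gaussian BIC. It is not the authors' algorithm: the function names, the choice of inverse-norm adaptive weights, and the textbook BIC form are assumptions made for illustration, and the paper's own degrees-of-freedom expressions are not reproduced here.

```python
import math

def group_norm(beta):
    """Euclidean norm of one coefficient group (e.g. one B-spline block)."""
    return math.sqrt(sum(b * b for b in beta))

def group_soft_threshold(beta, threshold):
    """Shrink a whole group toward zero; drop it entirely if its norm <= threshold."""
    norm = group_norm(beta)
    if norm <= threshold:
        return [0.0] * len(beta)
    scale = 1.0 - threshold / norm
    return [scale * b for b in beta]

def adaptive_weights(initial_groups, eps=1e-12):
    """Assumed adaptive weights: inverse norm of an initial estimate of each group."""
    return [1.0 / max(group_norm(g), eps) for g in initial_groups]

def bic(rss, n, df):
    """Generic Gaussian BIC up to a constant: n * log(rss / n) + log(n) * df."""
    return n * math.log(rss / n) + math.log(n) * df

# Hypothetical initial estimates: one strong group and one weak group.
groups = [[3.0, 4.0], [0.03, 0.04]]
lam = 0.1  # hypothetical tuning parameter
weights = adaptive_weights(groups)
shrunk = [group_soft_threshold(g, lam * w) for g, w in zip(groups, weights)]
```

With these inputs, the weak group receives a large adaptive weight and is set to zero as a block, while the strong group is only slightly shrunk. In practice the tuning parameter would be chosen by comparing a criterion such as `bic` across candidate fits.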

Article information

Electron. J. Statist., Volume 6 (2012), 810-842.

First available in Project Euclid: 9 May 2012

Primary: 62G08: Nonparametric regression

Adaptive group lasso; additive mixed model; Bayesian information criterion; consistency


Lai, Randy C. S.; Huang, Hsin-Cheng; Lee, Thomas C. M. Fixed and random effects selection in nonparametric additive mixed models. Electron. J. Statist. 6 (2012), 810-842. doi:10.1214/12-EJS695. https://projecteuclid.org/euclid.ejs/1336568107


