The Annals of Statistics

Asymptotic optimality and efficient computation of the leave-subject-out cross-validation

Ganggang Xu and Jianhua Z. Huang


Although leave-subject-out cross-validation (CV) has been widely used in practice for tuning parameter selection in various nonparametric and semiparametric models of longitudinal data, its theoretical properties are unknown, and solving the associated optimization problem is computationally expensive, especially when there are multiple tuning parameters. In this paper, focusing on the penalized spline method, we show that leave-subject-out CV is optimal in the sense that it is asymptotically equivalent to minimizing the empirical squared error loss. An efficient Newton-type algorithm is developed to compute the penalty parameters that optimize the CV criterion. Simulated and real data are used to demonstrate the effectiveness of leave-subject-out CV in selecting both the penalty parameters and the working correlation matrix.
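
To make the criterion concrete, the following is a minimal sketch of leave-subject-out CV for a single penalty parameter. It is not the Newton-type algorithm of the paper: it refits the model once per held-out subject and searches a fixed grid. The truncated power basis, the ridge-type penalty, the working-independence fit, the simulated data, and all function names are assumptions made for illustration only.

```python
import numpy as np

def truncated_power_basis(t, knots, degree=3):
    """Design matrix [1, t, ..., t^p, (t - k_1)_+^p, ..., (t - k_K)_+^p]."""
    t = np.asarray(t, dtype=float)
    poly = np.vander(t, degree + 1, increasing=True)
    trunc = np.clip(t[:, None] - knots[None, :], 0.0, None) ** degree
    return np.hstack([poly, trunc])

def fit_penalized_spline(X, y, lam, n_unpenalized):
    """Penalized least squares under working independence (an assumption)."""
    D = np.eye(X.shape[1])
    D[:n_unpenalized, :n_unpenalized] = 0.0  # polynomial part is not penalized
    return np.linalg.solve(X.T @ X + lam * D, X.T @ y)

def lso_cv(t, y, subject, lam, knots, degree=3):
    """Leave-subject-out CV score, computed by brute-force refitting."""
    X = truncated_power_basis(t, knots, degree)
    ids = np.unique(subject)
    total = 0.0
    for i in ids:
        held_out = subject == i  # all observations of subject i leave together
        beta = fit_penalized_spline(X[~held_out], y[~held_out], lam, degree + 1)
        resid = y[held_out] - X[held_out] @ beta
        total += resid @ resid
    return total / len(ids)

def simulate(n_subjects=30, n_obs=8, seed=0):
    """Toy longitudinal data: smooth mean + subject random intercept + noise."""
    rng = np.random.default_rng(seed)
    subject = np.repeat(np.arange(n_subjects), n_obs)
    t = rng.uniform(0.0, 1.0, size=n_subjects * n_obs)
    intercept = rng.normal(0.0, 0.5, size=n_subjects)[subject]
    y = np.sin(2 * np.pi * t) + intercept + rng.normal(0.0, 0.3, size=t.size)
    return t, y, subject

t, y, subject = simulate()
knots = np.linspace(0.1, 0.9, 9)
lambdas = 10.0 ** np.arange(-6, 3)  # candidate penalty parameters
scores = np.array([lso_cv(t, y, subject, lam, knots) for lam in lambdas])
best_lambda = lambdas[np.argmin(scores)]
print("selected penalty parameter:", best_lambda)
```

Holding out whole subjects, rather than single observations, keeps correlated measurements from the same subject out of the training set. The cost of this brute-force version grows with the number of subjects times the grid size, and a grid becomes impractical with several penalty parameters, which is the computational burden the abstract refers to.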

Article information

Ann. Statist., Volume 40, Number 6 (2012), 3003-3030.

First available in Project Euclid: 8 February 2013

Primary: 62G08: Nonparametric regression
Secondary: 62G05: Estimation; 62G20: Asymptotic properties; 62H12: Estimation; 41A15: Spline approximation

Cross-validation; generalized estimating equations; multiple smoothing parameters; penalized splines; working correlation matrices


Xu, Ganggang; Huang, Jianhua Z. Asymptotic optimality and efficient computation of the leave-subject-out cross-validation. Ann. Statist. 40 (2012), no. 6, 3003–3030. doi:10.1214/12-AOS1063.


Supplemental materials

  • Supplementary material: Efficient algorithm and additional proofs. In the Supplementary Material, we give a detailed description of the algorithm proposed in Section 3.2. In addition, proofs of some technical lemmas are also included.