Annals of Applied Statistics

Feature screening for time-varying coefficient models with ultrahigh-dimensional longitudinal data

Wanghuan Chu, Runze Li, and Matthew Reimherr

Full-text: Open access

Abstract

Motivated by an empirical analysis of the Childhood Asthma Management Program (CAMP), we introduce a new screening procedure for varying coefficient models with ultrahigh-dimensional longitudinal predictor variables. The performance of the proposed procedure is investigated via Monte Carlo simulation. Numerical comparisons indicate that it substantially outperforms existing procedures, yielding significant improvements in explained variability and prediction error. Applying these methods to CAMP, we find a number of potentially important genetic mutations related to lung function, several of which exhibit interesting nonlinear patterns around puberty.

Article information

Source
Ann. Appl. Stat., Volume 10, Number 2 (2016), 596-617.

Dates
Received: January 2016
First available in Project Euclid: 22 July 2016

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1469199886

Digital Object Identifier
doi:10.1214/16-AOAS912

Mathematical Reviews number (MathSciNet)
MR3528353

Zentralblatt MATH identifier
06625662

Keywords
Feature selection; time-varying coefficient models; ultrahigh-dimensional longitudinal data; genome-wide association study; functional linear model

Citation

Chu, Wanghuan; Li, Runze; Reimherr, Matthew. Feature screening for time-varying coefficient models with ultrahigh-dimensional longitudinal data. Ann. Appl. Stat. 10 (2016), no. 2, 596–617. doi:10.1214/16-AOAS912. https://projecteuclid.org/euclid.aoas/1469199886


References

  • Chu, W., Li, R. and Reimherr, M. (2016). Supplement to “Feature screening for time-varying coefficient models with ultrahigh-dimensional longitudinal data.” DOI:10.1214/16-AOAS912SUPP.
  • Fan, J., Feng, Y. and Song, R. (2011). Nonparametric independence screening in sparse ultra-high-dimensional additive models. J. Amer. Statist. Assoc. 106 544–557.
  • Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
  • Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 849–911.
  • Fan, J., Ma, Y. and Dai, W. (2014). Nonparametric independence screening in sparse ultra-high-dimensional varying coefficient models. J. Amer. Statist. Assoc. 109 1270–1284.
  • Fan, J. and Song, R. (2010). Sure independence screening in generalized linear models with NP-dimensionality. Ann. Statist. 38 3567–3604.
  • He, X., Wang, L. and Hong, H. G. (2013). Quantile-adaptive model-free variable screening for high-dimensional heterogeneous data. Ann. Statist. 41 342–369.
  • Huang, J. Z., Wu, C. O. and Zhou, L. (2004). Polynomial spline estimation and inference for varying coefficient models with longitudinal data. Statist. Sinica 14 763–788.
  • Li, R., Zhong, W. and Zhu, L. (2012). Feature screening via distance correlation learning. J. Amer. Statist. Assoc. 107 1129–1139.
  • Liang, K. Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73 13–22.
  • Liu, J., Li, R. and Wu, R. (2014). Feature selection for varying coefficient models with ultrahigh-dimensional covariates. J. Amer. Statist. Assoc. 109 266–274.
  • Reimherr, M. and Nicolae, D. (2014). A functional data analysis approach for genetic association studies. Ann. Appl. Stat. 8 406–429.
  • Song, R., Yi, F. and Zou, H. (2014). On varying-coefficient independence screening for high-dimensional varying-coefficient models. Statist. Sinica 24 1735–1752.
  • The Childhood Asthma Management Program Research Group (1999). The Childhood Asthma Management Program (CAMP): Design, rationale, and methods. Control. Clin. Trials 20 91–120.
  • The Childhood Asthma Management Program Research Group (2000). Long-term effects of budesonide or nedocromil in children with asthma. N. Engl. J. Med. 343 1054–1063.
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
  • Zhu, L.-P., Li, L., Li, R. and Zhu, L.-X. (2011). Model-free feature screening for ultrahigh-dimensional data. J. Amer. Statist. Assoc. 106 1464–1475.

Supplemental materials

  • Supplement to “Feature screening for time-varying coefficient models with ultrahigh-dimensional longitudinal data”. Theoretical properties with technical proofs, and additional simulation results for $p=5000$ and $p=10{,}000$, are given in the online supplement.