Open Access
Nonparametric independence screening and structure identification for ultra-high dimensional longitudinal data
Ming-Yen Cheng, Toshio Honda, Jialiang Li, Heng Peng
Ann. Statist. 42(5): 1819-1849 (October 2014). DOI: 10.1214/14-AOS1236

Abstract

Ultra-high dimensional longitudinal data are increasingly common, and their analysis is challenging both theoretically and methodologically. We offer a new automatic procedure for finding a sparse semivarying coefficient model, which is widely accepted for longitudinal data analysis. Our proposed method first reduces the number of covariates to a moderate order by employing a screening procedure, and then identifies both the varying and constant coefficients using a group SCAD estimator, which is subsequently refined by accounting for the within-subject correlation. The screening procedure is based on working independence and B-spline marginal models. Under weaker conditions than those in the literature, we show that with high probability only irrelevant variables will be screened out, and the number of selected variables can be bounded by a moderate order. This allows the desirable sparsity and oracle properties of the subsequent structure identification step. Existing methods require some form of iterative screening to achieve this; they therefore demand heavy computational effort, and their consistency is not guaranteed. The refined semivarying coefficient model employs profile least squares, local linear smoothing and nonparametric covariance estimation, and is semiparametrically efficient. We also suggest ways to implement the proposed methods and to select the tuning parameters. An extensive simulation study is summarized to demonstrate the finite sample performance of the procedure, and the yeast cell cycle data are analyzed.
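
As a rough illustration of the penalty behind the group SCAD step mentioned in the abstract, the following sketch evaluates the standard SCAD penalty of Fan and Li (2001) on the Euclidean norm of each coefficient group. It is not the authors' implementation. The function names `scad_penalty` and `group_scad_penalty`, the default `a = 3.7`, and the representation of each covariate's spline coefficients as a plain list are assumptions made for this example only.

```python
import math


def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty evaluated at a scalar theta.

    Assumed illustration only: lam > 0 is the tuning parameter and
    a > 2 is the shape parameter (3.7 is the conventional default).
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Behaves like the lasso penalty near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large values, so large coefficients are not over-shrunk.
    return lam * lam * (a + 1) / 2


def group_scad_penalty(groups, lam, a=3.7):
    """Sum of SCAD penalties applied to the Euclidean norm of each group.

    `groups` is a hypothetical list of coefficient lists, one list per
    covariate (for example, the spline coefficients of one function).
    """
    total = 0.0
    for coefs in groups:
        norm = math.sqrt(sum(c * c for c in coefs))
        total += scad_penalty(norm, lam, a)
    return total


if __name__ == "__main__":
    # One small group, one large group, one group that is exactly zero.
    example = [[0.3, 0.4], [3.0, 4.0], [0.0, 0.0]]
    print(group_scad_penalty(example, lam=1.0))
```

Because the penalty is flat beyond `a * lam`, groups with a large norm incur a fixed cost, while groups with a small norm are penalized in proportion to their norm, which is what pushes entire groups toward zero.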

Citation

Ming-Yen Cheng, Toshio Honda, Jialiang Li, Heng Peng. "Nonparametric independence screening and structure identification for ultra-high dimensional longitudinal data." Ann. Statist. 42 (5) 1819-1849, October 2014. https://doi.org/10.1214/14-AOS1236

Information

Published: October 2014
First available in Project Euclid: 11 September 2014

zbMATH: 1305.62169
MathSciNet: MR3262469
Digital Object Identifier: 10.1214/14-AOS1236

Subjects:
Primary: 62G08

Keywords: B-spline, independence screening, longitudinal data, oracle property, SCAD, sparsity

Rights: Copyright © 2014 Institute of Mathematical Statistics
