Open Access
December 2010
Sure independence screening in generalized linear models with NP-dimensionality
Jianqing Fan, Rui Song
Ann. Statist. 38(6): 3567-3604 (December 2010). DOI: 10.1214/10-AOS798

Abstract

Ultrahigh-dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. Among others, Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849–911] proposed an independence screening framework based on ranking the marginal correlations. They showed that the correlation ranking procedure possesses a sure independence screening property within the context of the linear model with Gaussian covariates and responses. In this paper, we propose a more general version of independence learning that ranks the maximum marginal likelihood estimates, or the maximum marginal likelihood itself, in generalized linear models. We show that the proposed methods, with Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849–911] as a very special case, also possess the sure screening property with vanishing false selection rate. The conditions under which independence learning possesses the sure screening property are surprisingly simple, which justifies the applicability of such a simple method across a wide spectrum of settings. We quantify explicitly the extent to which the dimensionality can be reduced by independence screening, which depends on the interaction between the covariance matrix of the covariates and the true parameters. Simulation studies are used to illustrate the utility of the proposed approaches. In addition, we establish an exponential inequality for the quasi-maximum likelihood estimator, which is useful for high-dimensional statistical learning.
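
To make the ranking idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a logistic model (one member of the generalized linear model family), standardized covariates, and a user-chosen number d of covariates to retain. The function names `marginal_logistic_mle` and `screen_by_marginal_mle`, the Newton-Raphson fitting routine, the synthetic data, and the value of d are all hypothetical choices made for illustration. Each covariate is fitted on its own (with an intercept), and covariates are ranked by the magnitude of their marginal maximum likelihood slope.

```python
import numpy as np


def marginal_logistic_mle(x, y, n_iter=25, tol=1e-8):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; return array [b0, b1]."""
    design = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        eta = np.clip(design @ beta, -30.0, 30.0)   # guard against overflow
        prob = 1.0 / (1.0 + np.exp(-eta))
        weights = prob * (1.0 - prob)
        grad = design.T @ (y - prob)                # score vector
        hess = design.T @ (design * weights[:, None]) + 1e-8 * np.eye(2)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def screen_by_marginal_mle(X, y, d):
    """Keep the d covariates with the largest |marginal MLE slope|.

    Covariates are standardized first so that slopes are comparable.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    slopes = np.array(
        [marginal_logistic_mle(Xs[:, j], y)[1] for j in range(Xs.shape[1])]
    )
    order = np.argsort(-np.abs(slopes))             # descending by magnitude
    return order[:d], slopes


# Synthetic illustration (assumed setup): 3 active covariates out of 1000.
rng = np.random.default_rng(0)
n, p, d = 400, 1000, 50
X = rng.standard_normal((n, p))
true_idx = [0, 1, 2]
eta = 2.0 * X[:, 0] - 2.0 * X[:, 1] + 2.0 * X[:, 2]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

selected, slopes = screen_by_marginal_mle(X, y, d)
print("all active covariates retained:", set(true_idx) <= set(selected.tolist()))
```

In this sketch the screening step only reduces the dimension from p to d; a subsequent, more refined selection method would typically be applied to the retained covariates. Thresholding the slope magnitudes at a fixed level, rather than keeping a fixed number d, is an equivalent way to express the same ranking rule.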

Citation

Jianqing Fan, Rui Song. "Sure independence screening in generalized linear models with NP-dimensionality." Ann. Statist. 38 (6): 3567-3604, December 2010. https://doi.org/10.1214/10-AOS798

Information

Published: December 2010
First available in Project Euclid: 30 November 2010

zbMATH: 1206.68157
MathSciNet: MR2766861
Digital Object Identifier: 10.1214/10-AOS798

Subjects:
Primary: 62J12, 68Q32
Secondary: 60F10, 62E99

Keywords: generalized linear models, independent learning, sure independent screening, variable selection

Rights: Copyright © 2010 Institute of Mathematical Statistics
