Nonparametric screening under conditional strictly convex loss for ultrahigh dimensional sparse data

Xu Han

doi:10.1214/18-AOS1738

Ann. Statist. 47(4): 1995-2022 (August 2019). DOI: 10.1214/18-AOS1738

Abstract

Sure screening is regarded as a powerful tool for ultrahigh dimensional variable selection problems, where the dimensionality $p$ and the sample size $n$ can satisfy the NP dimensionality $\log p=O(n^{a})$ for some $a>0$ [J. R. Stat. Soc. Ser. B. Stat. Methodol. 70 (2008) 849–911]. The current paper aims to tackle the “universality” and “effectiveness” of sure screening procedures simultaneously. For “universality,” we develop a general and unified framework for nonparametric screening methods from a loss function perspective. We consider a loss function that measures the divergence between the response variable and the underlying nonparametric function of the covariates. We propose a new class of loss functions, called conditional strictly convex loss, which contains, but is not limited to, the negative log-likelihood loss from one-parameter exponential families, the exponential loss for binary classification and the quantile regression loss. The sure screening property and control of the selected model size are established within this class of loss functions. For “effectiveness,” we focus on a goodness-of-fit nonparametric screening (Goffins) method under conditional strictly convex loss. Interestingly, we achieve a better convergence probability of containing the true model than the related literature. The superior performance of the proposed method is further demonstrated by extensive simulation studies and a real scientific data example.
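To make the idea of goodness-of-fit screening under a convex loss concrete, the following is a minimal illustrative sketch and not the Goffins procedure from the paper. It assumes a piecewise-constant marginal fit on equal-count bins as a stand-in for whatever nonparametric smoother the paper uses, and it ranks covariates by how much that marginal fit lowers the average loss relative to a constant fit. The function names (`marginal_gain`, `screen`) and the toy data are hypothetical.

```python
import math
import statistics


def squared_loss(y, f):
    """Gaussian negative log-likelihood, up to additive and scale constants."""
    return 0.5 * (y - f) ** 2


def check_loss(y, f, tau=0.5):
    """Quantile regression (check) loss at level tau."""
    u = y - f
    return u * (tau - (1.0 if u < 0 else 0.0))


def exponential_loss(y, f):
    """Exponential loss for binary labels y in {-1, +1}."""
    return math.exp(-y * f)


def marginal_gain(x, y, loss=squared_loss, const_fit=statistics.mean, n_bins=5):
    """Average loss reduction of a binned marginal fit over a constant fit.

    const_fit must return the constant minimizing the chosen loss
    (mean for squared loss, median for check loss with tau=0.5).
    The equal-count binning is an assumption made for this sketch.
    """
    n = len(y)
    c0 = const_fit(y)
    null_loss = sum(loss(yi, c0) for yi in y) / n
    order = sorted(range(n), key=lambda i: x[i])
    fit_loss = 0.0
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue
        c = const_fit([y[i] for i in idx])
        fit_loss += sum(loss(y[i], c) for i in idx)
    return null_loss - fit_loss / n


def screen(columns, y, keep, **kwargs):
    """Return the indices of the `keep` covariates with the largest gain."""
    gains = [marginal_gain(col, y, **kwargs) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: gains[j], reverse=True)
    return ranked[:keep]


# Toy data: y depends nonlinearly on x0 only; x1 is a scrambled copy of x0.
n = 100
x0 = [i / n for i in range(n)]
x1 = [(37 * i % n) / n for i in range(n)]
y = [v ** 2 for v in x0]
print(screen([x0, x1], y, keep=1))
```

Passing `loss=check_loss, const_fit=statistics.median` to `screen` gives a median-regression variant of the same ranking. The exponential loss is included only to show a third member of the loss class named in the abstract; using it here would need a matching constant minimizer, which this sketch does not provide.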

Citation

Xu Han. "Nonparametric screening under conditional strictly convex loss for ultrahigh dimensional sparse data." Ann. Statist. 47 (4) 1995-2022, August 2019. https://doi.org/10.1214/18-AOS1738

Information

Received: 1 June 2017; Revised: 1 June 2018; Published: August 2019

First available in Project Euclid: 21 May 2019

zbMATH: 07082277

MathSciNet: MR3953442

Digital Object Identifier: 10.1214/18-AOS1738

Subjects:

Primary: 62G99

Keywords: conditional strictly convex loss, goodness-of-fit nonparametric screening, sure screening property, ultrahigh dimensional variable selection

JOURNAL ARTICLE
28 PAGES