Open Access
August 2014
Feature selection when there are many influential features
Peter Hall, Jiashun Jin, Hugh Miller
Bernoulli 20(3): 1647-1671 (August 2014). DOI: 10.3150/13-BEJ536

Abstract

Recent discussion of the success of feature selection methods has argued that focusing on a relatively small number of features has been counterproductive. Instead, it is suggested, the number of significant features can be in the thousands or tens of thousands, rather than (as is commonly supposed at present) approximately in the range from five to fifty. This change, in orders of magnitude, in the number of influential features, necessitates alterations to the way in which we choose features and to the manner in which the success of feature selection is assessed. In this paper, we suggest a general approach that is suited to cases where the number of relevant features is very large, and we consider particular versions of the approach in detail. We propose ways of measuring performance, and we study both theoretical and numerical properties of the proposed methodology.

Citation

Peter Hall, Jiashun Jin, Hugh Miller. "Feature selection when there are many influential features." Bernoulli 20(3): 1647-1671, August 2014. https://doi.org/10.3150/13-BEJ536

Information

Published: August 2014
First available in Project Euclid: 11 June 2014

zbMATH: 06327922
MathSciNet: MR3217457
Digital Object Identifier: 10.3150/13-BEJ536

Keywords: change-point analysis, classification, dimension reduction, feature selection, logit model, maximum likelihood, ranking, thresholding

Rights: Copyright © 2014 Bernoulli Society for Mathematical Statistics and Probability

Vol.20 • No. 3 • August 2014