The Annals of Applied Statistics

Improved variable selection with Forward-Lasso adaptive shrinkage

Peter Radchenko and Gareth M. James


Abstract

Recently, considerable interest has focused on variable selection methods in regression situations where the number of predictors, p, is large relative to the number of observations, n. Two commonly applied variable selection approaches are the Lasso, which computes highly shrunk regression coefficients, and Forward Selection, which uses no shrinkage. We propose a new approach, “Forward-Lasso Adaptive SHrinkage” (FLASH), which includes the Lasso and Forward Selection as special cases, and can be used in both the linear regression and the Generalized Linear Model domains. As with the Lasso and Forward Selection, FLASH iteratively adds one variable to the model in a hierarchical fashion but, unlike these methods, at each step adjusts the level of shrinkage so as to optimize the selection of the next variable. We first present FLASH in the linear regression setting and show that it can be fitted using a variant of the computationally efficient LARS algorithm. Then, we extend FLASH to the GLM domain and demonstrate, through numerous simulations and real-world data sets, as well as some theoretical analysis, that FLASH generally outperforms many competing approaches.
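To make the interpolation idea in the abstract concrete, the following is a minimal Python sketch of a FLASH-style path, written under simplifying assumptions: it moves the active coefficients toward the least-squares fit on the active set rather than along the exact LARS equiangular direction, and it holds the shrinkage parameter fixed rather than tuning it adaptively at every step as the paper does. The function name flash_path_sketch and the parameter delta are illustrative choices, not notation from the paper.

```python
import numpy as np

def flash_path_sketch(X, y, delta=0.5, max_steps=None):
    """Simplified FLASH-style forward path (illustration only).

    At each step the predictor most correlated with the residual enters
    the model, and the active coefficients move toward the least-squares
    fit on the active set.  t_lasso is the fraction of that move at which
    some inactive predictor ties the maximum absolute correlation (where
    a Lasso/LARS step would stop); t = 1 is the full Forward Selection
    step.  A FLASH-style step interpolates,
        t = t_lasso + delta * (1 - t_lasso),
    so delta = 0 behaves like the Lasso and delta = 1 like Forward
    Selection.  Assumes the columns of X are centered and standardized.
    """
    n, p = X.shape
    beta = np.zeros(p)
    active = []
    path = []
    for _ in range(max_steps or min(n - 1, p)):
        r = y - X @ beta                       # current residual
        c = X.T @ r                            # correlations with residual
        j = int(np.argmax(np.abs(c)))
        if j not in active:
            active.append(j)                   # new variable enters
        Xa = X[:, active]
        beta_ls, *_ = np.linalg.lstsq(Xa, y, rcond=None)
        C = np.abs(c).max()
        # Along beta(t) = beta + t * (beta_ls - beta) on the active set,
        # inactive correlations evolve as c_k - t * a_k, while active
        # correlations shrink proportionally, (1 - t) * c_k.
        a = X.T @ (Xa @ (beta_ls - beta[active]))
        ts = []
        for k in range(p):
            if k in active:
                continue
            for t in ((C - c[k]) / (C - a[k]), (C + c[k]) / (C + a[k])):
                if np.isfinite(t) and 1e-10 < t < 1.0:
                    ts.append(float(t))
        t_lasso = min(ts, default=1.0)         # where a Lasso step stops
        t = t_lasso + delta * (1.0 - t_lasso)  # FLASH-style interpolation
        beta[active] = beta[active] + t * (beta_ls - beta[active])
        path.append((list(active), beta.copy()))
    return path
```

With delta = 0 each move stops at the point where a new variable would enter, a Lasso/LARS-type step; with delta = 1 each move jumps to the full least-squares fit on the active set, which is Forward Selection. Intermediate values trade shrinkage against greediness, and it is precisely this trade-off that FLASH optimizes at each step.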

Article information

Source
Ann. Appl. Stat., Volume 5, Number 1 (2011), 427–448.

Dates
First available in Project Euclid: 21 March 2011

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1300715197

Digital Object Identifier
doi:10.1214/10-AOAS375

Mathematical Reviews number (MathSciNet)
MR2810404

Zentralblatt MATH identifier
1220.62089

Keywords
Forward Selection; Lasso; shrinkage; variable selection

Citation

Radchenko, Peter; James, Gareth M. Improved variable selection with Forward-Lasso adaptive shrinkage. Ann. Appl. Stat. 5 (2011), no. 1, 427–448. doi:10.1214/10-AOAS375. https://projecteuclid.org/euclid.aoas/1300715197


References

  • Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n (with discussion). Ann. Statist. 35 2313–2351.
  • Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression (with discussion). Ann. Statist. 32 407–451.
  • Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
  • Frank, I. E. and Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics 35 109–135.
  • Friedman, J., Hastie, T., Höfling, H. and Tibshirani, R. (2007). Pathwise coordinate optimization. Ann. Appl. Stat. 1 302–332.
  • Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. J. Statist. Software 33 1–22.
  • Huang, J., Ma, S. and Zhang, C.-H. (2008). Adaptive Lasso for sparse high-dimensional regression models. Statist. Sinica 18 1603–1618.
  • Hwang, W., Zhang, H. and Ghosal, S. (2009). FIRST: Combining forward iterative selection and shrinkage in high dimensional sparse linear regression. Stat. Interface 2 341–348.
  • James, G. M. and Radchenko, P. (2009). A generalized Dantzig selector with shrinkage tuning. Biometrika 96 323–337.
  • Meinshausen, N. (2007). Relaxed lasso. Comput. Statist. Data Anal. 52 374–393.
  • Park, M. Y. and Hastie, T. (2007). L1-regularization path algorithm for generalized linear models. J. Roy. Statist. Soc. Ser. B 69 659–677.
  • Radchenko, P. and James, G. M. (2008). Variable inclusion and shrinkage algorithms. J. Amer. Statist. Assoc. 103 1304–1315.
  • Radchenko, P. and James, G. M. (2010). Supplement to “Improved variable selection with Forward-Lasso adaptive shrinkage.” DOI: 10.1214/10-AOAS375SUPP.
  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
  • Wainwright, M. J. (2009). Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ1-constrained quadratic programming (Lasso). IEEE Trans. Inform. Theory 55 2183–2202.
  • Zhao, P. and Yu, B. (2006). On model selection consistency of lasso. J. Mach. Learn. Res. 7 2541–2563.
  • Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418–1429.
  • Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. Roy. Statist. Soc. Ser. B 67 301–320.

Supplemental materials