Bernoulli


Consistent group selection in high-dimensional linear regression

Fengrong Wei and Jian Huang


Abstract

In regression problems where covariates can be naturally grouped, the group Lasso is an attractive method for variable selection because it respects the grouping structure in the data. We study the selection and estimation properties of the group Lasso in high-dimensional settings where the number of groups exceeds the sample size. We provide sufficient conditions under which, with high probability, the group Lasso selects a model whose dimension is comparable with that of the underlying model, and is estimation consistent. However, the group Lasso is, in general, not selection consistent and tends to select groups that are not important in the model. To improve the selection results, we propose an adaptive group Lasso method, a generalization of the adaptive Lasso that requires an initial estimator. We show that the adaptive group Lasso is consistent in group selection under certain conditions when the group Lasso is used as the initial estimator.
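
As a point of reference for the abstract, the following display sketches the two estimators in generic notation; the symbols, the group-size scaling $\sqrt{d_j}$ and the choice of weights are standard conventions assumed here for illustration, not quoted from the paper. With response $y$, a design matrix split into $J$ groups $X_1, \dots, X_J$ of sizes $d_1, \dots, d_J$ and coefficient blocks $\beta_1, \dots, \beta_J$,

$$
\hat{\beta}^{\mathrm{GL}} = \arg\min_{\beta} \Bigl\{ \bigl\| y - \sum_{j=1}^{J} X_j \beta_j \bigr\|_2^2 + \lambda_1 \sum_{j=1}^{J} \sqrt{d_j}\, \| \beta_j \|_2 \Bigr\},
\qquad
\hat{\beta}^{\mathrm{AGL}} = \arg\min_{\beta} \Bigl\{ \bigl\| y - \sum_{j=1}^{J} X_j \beta_j \bigr\|_2^2 + \lambda_2 \sum_{j=1}^{J} w_j \| \beta_j \|_2 \Bigr\},
$$

where a common choice of adaptive weights is $w_j = \| \tilde{\beta}_j \|_2^{-1}$ when the initial estimate $\tilde{\beta}_j$ (here, the group Lasso fit) is nonzero, and $w_j = \infty$ otherwise, so that groups dropped by the initial estimator are excluded from the final model.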

Article information

Source
Bernoulli, Volume 16, Number 4 (2010), 1369-1384.

Dates
First available in Project Euclid: 18 November 2010

Permanent link to this document
https://projecteuclid.org/euclid.bj/1290092910

Digital Object Identifier
doi:10.3150/10-BEJ252

Mathematical Reviews number (MathSciNet)
MR2759183

Zentralblatt MATH identifier
1207.62146

Keywords
group selection; high-dimensional data; penalized regression; rate consistency; selection consistency

Citation

Wei, Fengrong; Huang, Jian. Consistent group selection in high-dimensional linear regression. Bernoulli 16 (2010), no. 4, 1369–1384. doi:10.3150/10-BEJ252. https://projecteuclid.org/euclid.bj/1290092910



References

  • Antoniadis, A. and Fan, J. (2001). Regularization of wavelet approximation (with discussion). J. Amer. Statist. Assoc. 96 939–967.
  • Bühlmann, P. and Meier, L. (2008). Discussion of “One-step sparse estimates in nonconcave penalized likelihood models,” by H. Zou and R. Li. Ann. Statist. 36 1534–1541.
  • Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
  • Fan, J. and Peng, H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. Ann. Statist. 32 928–961.
  • Greenshtein, E. and Ritov, Y. (2004). Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli 10 971–988.
  • Huang, J., Horowitz, J.L. and Ma, S. (2008). Asymptotic properties of bridge estimators in sparse high-dimensional regression models. Ann. Statist. 36 587–613.
  • Huang, J., Ma, S. and Zhang, C.H. (2008). Adaptive Lasso for sparse high-dimensional regression models. Statist. Sinica 18 1603–1618.
  • Kim, Y., Kim, J. and Kim, Y. (2006). The blockwise sparse regression. Statist. Sinica 16 375–390.
  • Knight, K. and Fu, W.J. (2000). Asymptotics for lasso-type estimators. Ann. Statist. 28 1356–1378.
  • Meier, L., van de Geer, S. and Bühlmann, P. (2008). The group Lasso for logistic regression. J. R. Stat. Soc. Ser. B 70 53–71.
  • Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the Lasso. Ann. Statist. 34 1436–1462.
  • Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461–464.
  • Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B 58 267–288.
  • van de Geer, S. (2008). High-dimensional generalized linear models and the Lasso. Ann. Statist. 36 614–645.
  • Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B 68 49–67.
  • Zhang, C.H. (2007). Penalized linear unbiased selection. Technical Report 2007-003, Dept. Statistics, Rutgers Univ.
  • Zhang, C.H. and Huang, J. (2008). Model-selection consistency of the LASSO in high-dimensional linear regression. Ann. Statist. 36 1567–1594.
  • Zhao, P., Rocha, G. and Yu, B. (2009). The composite absolute penalties family for grouped and hierarchical variable selection. Ann. Statist. 37 3468–3497.
  • Zhao, P. and Yu, B. (2006). On model selection consistency of LASSO. J. Mach. Learn. Res. 7 2541–2563.
  • Zou, H. (2006). The adaptive Lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418–1429.
  • Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67 301–320.