Electronic Journal of Statistics

On the conditions used to prove oracle results for the Lasso

Sara A. van de Geer and Peter Bühlmann


Abstract

Oracle inequalities and variable selection properties for the Lasso in linear models have been established under a variety of different assumptions on the design matrix. We show in this paper how the different conditions and concepts relate to each other. The restricted eigenvalue condition [2] and the slightly weaker compatibility condition [18] are each sufficient for oracle results. We argue that both conditions allow for a fairly general class of design matrices. Hence, optimality of the Lasso for prediction and estimation holds in more general situations than coherence [5, 4] or restricted isometry [10] assumptions suggest.
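
For orientation, the two sufficient conditions can be sketched as follows; the notation is ours and not quoted from the article ($X \in \mathbb{R}^{n \times p}$ is the design matrix, $\hat{\Sigma} = X^{\top}X/n$ its Gram matrix, $S$ the active set with cardinality $s$, and $3$ a common choice of cone constant, whereas the paper works with a general constant). The compatibility condition [18] and the restricted eigenvalue condition [2] require, respectively,
\[
\phi_{\mathrm{comp}}^{2}(S) \;=\; \min_{\|\beta_{S^{c}}\|_{1} \le 3\|\beta_{S}\|_{1}} \frac{s\,\beta^{\top}\hat{\Sigma}\,\beta}{\|\beta_{S}\|_{1}^{2}} \;>\; 0,
\qquad
\phi_{\mathrm{RE}}^{2} \;=\; \min_{|S| \le s}\;\min_{\|\beta_{S^{c}}\|_{1} \le 3\|\beta_{S}\|_{1}} \frac{\beta^{\top}\hat{\Sigma}\,\beta}{\|\beta_{S}\|_{2}^{2}} \;>\; 0.
\]
Since $\|\beta_{S}\|_{1}^{2} \le s\,\|\beta_{S}\|_{2}^{2}$ by Cauchy–Schwarz, the restricted eigenvalue condition implies the compatibility condition, which is why the latter is described as slightly weaker.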

Article information

Source
Electron. J. Statist. Volume 3 (2009), 1360-1392.

Dates
First available in Project Euclid: 14 December 2009

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1260801227

Digital Object Identifier
doi:10.1214/09-EJS506

Mathematical Reviews number (MathSciNet)
MR2576316

Zentralblatt MATH identifier
1327.62425

Subjects
Primary: 62C05 (General considerations); 62G05 (Estimation)
Secondary: 62J07 (Ridge regression; shrinkage estimators); 94A12 (Signal theory: characterization, reconstruction, filtering, etc.)

Keywords
Coherence; compatibility; irrepresentable condition; Lasso; restricted eigenvalue; restricted isometry; sparsity

Citation

van de Geer, Sara A.; Bühlmann, Peter. On the conditions used to prove oracle results for the Lasso. Electron. J. Statist. 3 (2009), 1360–1392. doi:10.1214/09-EJS506. https://projecteuclid.org/euclid.ejs/1260801227



References

  • [1] Bertsimas and Tsitsiklis (1997). Introduction to Linear Optimization. Athena Scientific, Belmont, MA.
  • [2] Bickel, Ritov and Tsybakov (2009). Simultaneous analysis of Lasso and Dantzig selector. Annals of Statistics 37 1705–1732.
  • [3] Bunea, Tsybakov and Wegkamp (2007a). Aggregation for Gaussian regression. Annals of Statistics 35 1674.
  • [4] Bunea, Tsybakov and Wegkamp (2007c). Sparsity oracle inequalities for the Lasso. Electronic Journal of Statistics 1 169–194.
  • [5] Bunea, Tsybakov and Wegkamp (2007b). Sparse Density Estimation with ℓ1 Penalties. In Learning Theory: 20th Annual Conference on Learning Theory, COLT 2007, San Diego, CA, USA, June 13–15, 2007, Proceedings, 530. Springer.
  • [6] Cai, Wang and Xu (2009a). Shifting inequality and recovery of sparse signals. IEEE Transactions on Signal Processing, to appear.
  • [7] Cai, Wang and Xu (2009b). Stable recovery of sparse signals and an oracle inequality. Preprint.
  • [8] Cai, Xu and Zhang (2009). On recovery of sparse signals via ℓ1 minimization. IEEE Transactions on Information Theory 55 3388–3397.
  • [9] Candès and Plan (2009). Near-ideal model selection by ℓ1 minimization. Annals of Statistics 37 2145–2177.
  • [10] Candès and Tao (2005). Decoding by linear programming. IEEE Transactions on Information Theory 51 4203–4215.
  • [11] Candès and Tao (2007). The Dantzig selector: statistical estimation when p is much larger than n. Annals of Statistics 35 2313–2351.
  • [12] Koltchinskii (2009a). Sparsity in penalized empirical risk minimization. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques 45 7–57.
  • [13] Koltchinskii (2009b). The Dantzig selector and sparsity oracle inequalities. Bernoulli 15 799–828.
  • [14] Lounici (2008). Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electronic Journal of Statistics 2 90–102.
  • [15] Meinshausen and Bühlmann (2006). High-dimensional graphs and variable selection with the Lasso. Annals of Statistics 34 1436–1462.
  • [16] Meinshausen and Yu (2009). Lasso-type recovery of sparse representations for high-dimensional data. Annals of Statistics 37 246–270.
  • [17] Parter (1961). Extreme eigenvalues of Toeplitz forms and applications to elliptic difference equations. Transactions of the American Mathematical Society 99 153–192.
  • [18] van de Geer (2007). The deterministic Lasso. In JSM Proceedings (see also http://stat.ethz.ch/research/research_reports/2007/140). American Statistical Association.
  • [19] van de Geer (2008). High-dimensional generalized linear models and the Lasso. Annals of Statistics 36 614–645.
  • [20] Wainwright (2009). Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ1-constrained quadratic programming (Lasso). IEEE Transactions on Information Theory 55 2183–2202.
  • [21] Zhang and Huang (2008). The sparsity and bias of the Lasso selection in high-dimensional linear regression. Annals of Statistics 36 1567–1594.
  • [22] Zhang (2009). Some sharp performance bounds for least squares regression with ℓ1 regularization. Annals of Statistics 37 2109–2144.
  • [23] Zhao and Yu (2006). On model selection consistency of Lasso. Journal of Machine Learning Research 7 2541–2563.
  • [24] Zou (2006). The adaptive Lasso and its oracle properties. Journal of the American Statistical Association 101 1418–1429.