## Annals of Statistics

### Slope meets Lasso: Improved oracle bounds and optimality

#### Abstract

We show that two polynomial-time methods, a Lasso estimator with adaptively chosen tuning parameter and a Slope estimator, adaptively achieve the minimax prediction and $\ell_{2}$ estimation rate $(s/n)\log(p/s)$ in high-dimensional linear regression on the class of $s$-sparse vectors in $\mathbb{R}^{p}$. This is done under the Restricted Eigenvalue (RE) condition for the Lasso and under a slightly more constraining assumption on the design for the Slope. The main results have the form of sharp oracle inequalities accounting for the model misspecification error. Minimax optimal bounds are also obtained for the $\ell_{q}$ estimation errors with $1\le q\le2$ when the model is well specified. The results are nonasymptotic and hold both in probability and in expectation. The assumptions that we impose on the design are satisfied with high probability for a large class of random matrices with independent and possibly anisotropically distributed rows. We give a comparative analysis of the conditions under which oracle bounds for the Lasso and Slope estimators can be obtained. In particular, we show that several known conditions, such as the RE condition and the sparse eigenvalue condition, are equivalent if the $\ell_{2}$-norms of regressors are uniformly bounded.
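As an illustrative aside (not part of the article): the Slope estimator penalizes the sorted absolute coefficients with a nonincreasing weight sequence, typically $\lambda_j \propto \sqrt{\log(2p/j)}$, and is usually computed by proximal gradient methods. Its proximal operator soft-shifts the sorted magnitudes and projects onto the nonincreasing cone via pool-adjacent-violators. A minimal NumPy sketch of this standard prox construction (function names are ours, not from the paper):

```python
import numpy as np

def _project_nonincreasing(y):
    """Project y onto the nonincreasing cone (pool-adjacent-violators)."""
    blocks = []  # stack of (block mean, block length)
    for val in y:
        blocks.append((float(val), 1))
        # merge while an earlier block mean lies below a later one
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append(((m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2))
    return np.concatenate([np.full(c, m) for m, c in blocks])

def prox_sorted_l1(v, lam):
    """Prox of the sorted-l1 (Slope) penalty:
    argmin_x 0.5 * ||x - v||^2 + sum_j lam_j * |x|_(j),
    where lam is nonnegative and nonincreasing.
    """
    order = np.argsort(-np.abs(v))      # indices sorting |v| in decreasing order
    u = np.abs(v)[order] - lam          # soft shift of the sorted magnitudes
    x_sorted = np.clip(_project_nonincreasing(u), 0.0, None)
    x = np.empty(len(v))
    x[order] = x_sorted                 # undo the sort
    return np.sign(v) * x

# With equal weights the Slope prox reduces to ordinary Lasso soft-thresholding.
v = np.array([0.5, 3.0, -2.0])
print(prox_sorted_l1(v, np.ones(3)))  # -> [ 0.  2. -1.]
```

With a strictly decreasing weight sequence the prox additionally averages tied magnitudes, which is what distinguishes Slope from the Lasso's componentwise soft-thresholding.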

#### Article information

Source
Ann. Statist., Volume 46, Number 6B (2018), 3603-3642.

Dates
Revised: May 2017
First available in Project Euclid: 11 September 2018

https://projecteuclid.org/euclid.aos/1536631285

Digital Object Identifier
doi:10.1214/17-AOS1670

Mathematical Reviews number (MathSciNet)
MR3852663

Zentralblatt MATH identifier
1405.62056

#### Citation

Bellec, Pierre C.; Lecué, Guillaume; Tsybakov, Alexandre B. Slope meets Lasso: Improved oracle bounds and optimality. Ann. Statist. 46 (2018), no. 6B, 3603--3642. doi:10.1214/17-AOS1670. https://projecteuclid.org/euclid.aos/1536631285

#### References

• [1] Abramovich, F. and Grinshtein, V. (2010). MAP model selection in Gaussian regression. Electron. J. Stat. 4 932–949.
• [2] Abramowitz, M. and Stegun, I. A. (1964). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards Applied Mathematics Series 55. Washington, DC.
• [3] Berthet, Q. and Rigollet, P. (2013). Optimal detection of sparse principal components in high dimension. Ann. Statist. 41 1780–1815.
• [4] Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of Lasso and Dantzig selector. Ann. Statist. 37 1705–1732.
• [5] Bogdan, M., van den Berg, E., Sabatti, C., Su, W. and Candès, E. J. (2015). SLOPE—Adaptive variable selection via convex optimization. Ann. Appl. Stat. 9 1103–1140.
• [6] Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford Univ. Press, Oxford.
• [7] Candès, E. J. and Davenport, M. A. (2013). How well can we estimate a sparse vector? Appl. Comput. Harmon. Anal. 34 317–323.
• [8] Candès, E. J. and Tao, T. (2006). Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. Inform. Theory 52 5406–5425.
• [9] Chafaï, D., Guédon, O., Lecué, G. and Pajor, A. (2012). Interactions Between Compressed Sensing Random Matrices and High Dimensional Geometry. Panoramas et Synthèses 37. Société Mathématique de France, Paris.
• [10] Dalalyan, A. S., Hebiri, M. and Lederer, J. (2017). On the prediction performance of the Lasso. Bernoulli 23 552–581.
• [11] Dirksen, S. (2015). Tail bounds via generic chaining. Electron. J. Probab. 20 no. 53, 29.
• [12] Giraud, C. (2015). Introduction to High-Dimensional Statistics. Monographs on Statistics and Applied Probability 139. CRC Press, Boca Raton, FL.
• [13] Koltchinskii, V. (2011). Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. École d'Été de Probabilités de Saint-Flour XXXVIII-2008. Springer, New York.
• [14] Koltchinskii, V., Lounici, K. and Tsybakov, A. B. (2011). Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Statist. 39 2302–2329.
• [15] Koltchinskii, V. and Mendelson, S. (2015). Bounding the smallest singular value of a random matrix without concentration. Int. Math. Res. Not. IMRN 23 12991–13008.
• [16] Lecué, G. and Mendelson, S. (2015). Regularization and the small-ball method I: Sparse recovery. Technical report, CNRS, ENSAE and Technion, I.I.T.
• [17] Lecué, G. and Mendelson, S. (2017). Sparse recovery under weak moment assumptions. J. Eur. Math. Soc. (JEMS) 19 881–904.
• [18] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces: Isoperimetry and Processes. Ergebnisse der Mathematik und Ihrer Grenzgebiete (3) 23. Springer, Berlin.
• [19] Lounici, K., Pontil, M., Tsybakov, A. B. and van de Geer, S. A. (2011). Oracle inequalities and optimal inference under group sparsity. Ann. Statist. 39 2164–2204.
• [20] Mendelson, S. (2014). Learning without concentration. In Proceedings of the 27th Annual Conference on Learning Theory (COLT 2014) 25–39.
• [21] Mendelson, S. (2016). Upper bounds on product and multiplier empirical processes. Stochastic Process. Appl. 126 3652–3680.
• [22] Mendelson, S. (2015). Learning without concentration. J. ACM 62 Art. 21, 25.
• [23] Raskutti, G., Wainwright, M. J. and Yu, B. (2011). Minimax rates of estimation for high-dimensional linear regression over $\ell_{q}$-balls. IEEE Trans. Inform. Theory 57 6976–6994.
• [24] Rigollet, P. and Tsybakov, A. (2011). Exponential screening and optimal rates of sparse estimation. Ann. Statist. 39 731–771.
• [25] Su, W. and Candès, E. (2016). SLOPE is adaptive to unknown sparsity and asymptotically minimax. Ann. Statist. 44 1038–1068.
• [26] Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer, New York.
• [27] Verzelen, N. (2012). Minimax risks for sparse regressions: Ultra-high dimensional phenomenons. Electron. J. Stat. 6 38–90.
• [28] Bednorz, W. (2013). Concentration via chaining method and its applications. Technical report, Univ. Warsaw. Available at arXiv:1405.0676.
• [29] Ye, F. and Zhang, C.-H. (2010). Rate minimaxity of the Lasso and Dantzig selector for the $\ell_{q}$ loss in $\ell_{r}$ balls. J. Mach. Learn. Res. 11 3519–3540.