Open Access
The Lasso as an ℓ1-ball model selection procedure
Pascal Massart, Caroline Meynet
Electron. J. Statist. 5: 669-687 (2011). DOI: 10.1214/11-EJS623

Abstract

While many efforts have been made to prove that the Lasso behaves like a variable selection procedure at the price of strong (though unavoidable) assumptions on the geometric structure of the variables, much less attention has been paid to oracle inequalities for the Lasso involving the ℓ1-norm of the target vector. The inequalities proved in the literature show that, provided the regularization parameter is properly chosen, the Lasso approximately mimics the deterministic Lasso. Some of them require no assumption at all, neither on the structure of the variables nor on the regression function. Our first purpose here is to provide a conceptually very simple result in this direction in the framework of Gaussian models with non-random regressors.
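
A minimal sketch of the setting and of the generic shape of such an ℓ1-oracle inequality (the notation, constant and remainder term below are illustrative, not the paper's exact statement): consider fixed-design Gaussian regression $Y_i = f(x_i) + \varepsilon_i$ with i.i.d. $\varepsilon_i \sim \mathcal{N}(0,\sigma^2)$, a dictionary $\{\phi_1,\dots,\phi_p\}$ and $f_\beta = \sum_{j} \beta_j \phi_j$. Then

$$
\hat\beta_\lambda \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p}
\Big\{ \|Y - f_\beta\|_n^2 + \lambda \|\beta\|_1 \Big\},
\qquad
\mathbb{E}\,\big\|f_{\hat\beta_\lambda} - f\big\|_n^2
\;\le\; C \inf_{\beta \in \mathbb{R}^p}
\Big\{ \|f_\beta - f\|_n^2 + \lambda \|\beta\|_1 \Big\} + r_n ,
$$

for a regularization parameter $\lambda$ of order $\sigma \sqrt{\log p / n}$ and a remainder term $r_n$ that is negligible as $n$ grows.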

Our second purpose is to propose a new estimator particularly adapted to infinite countable dictionaries. This estimator is constructed as an ℓ0-penalized estimator among a sequence of Lasso estimators associated with a dyadic sequence of growing truncated dictionaries. The selection procedure automatically chooses the best truncation level of the dictionary so as to make the best tradeoff between approximation, ℓ1-regularization and sparsity; a computational sketch is given below. From a theoretical point of view, we provide an oracle inequality satisfied by this selected Lasso estimator.
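
A minimal computational sketch of this selection scheme, assuming the truncated dictionaries are given by the leading columns of a design matrix X and using scikit-learn's Lasso as a stand-in for the ℓ1 step; the function name selected_lasso, the noise proxy and the shape of the ℓ0-type penalty are illustrative assumptions, not the paper's calibration.

```python
import numpy as np
from sklearn.linear_model import Lasso

def selected_lasso(X, y, lam, kappa=1.0):
    """Illustrative sketch: fit a Lasso on each dyadic truncation D_m
    (the first 2^m columns of X), then select the truncation level by
    adding an l0-type penalty to the Lasso criterion. The penalty shape
    and noise proxy are assumptions, not the paper's exact calibration."""
    n, p = X.shape
    sigma2 = np.var(y)          # crude noise-level proxy, illustration only
    best = None
    m = 0
    while 2 ** m <= p:
        d = 2 ** m              # size of the truncated dictionary D_m
        Xm = X[:, :d]
        # scikit-learn's Lasso uses a (1/(2n))-scaled squared loss; lam is
        # treated loosely here as the l1 regularization parameter.
        fit = Lasso(alpha=lam, fit_intercept=False).fit(Xm, y)
        resid = y - Xm @ fit.coef_
        crit = (np.mean(resid ** 2)                        # squared loss
                + lam * np.sum(np.abs(fit.coef_))          # l1 penalty
                + kappa * sigma2 * d * np.log(p) / n)      # l0-type penalty in d
        if best is None or crit < best[0]:
            best = (crit, d, fit.coef_)
        m += 1
    return best   # (criterion value, selected dictionary size, Lasso coefficients)
```

For instance, if X is an n-by-p matrix whose columns are the first p dictionary elements evaluated at the design points, selected_lasso(X, y, lam) returns the penalized criterion value, the selected dictionary size and the corresponding Lasso coefficients.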

The oracle inequalities presented in this paper are obtained by applying a general theorem on model selection among a collection of nonlinear models, which is a direct consequence of the Gaussian concentration inequality. The key idea that enables us to apply this general theorem is to view ℓ1-regularization as a model selection procedure among ℓ1-balls.
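
The identification behind this key idea can be sketched via the standard correspondence between the penalized and constrained forms of the Lasso (notation as in the sketch above, not the paper's exact formulation):

$$
\hat\beta_\lambda \in \operatorname*{arg\,min}_{\beta}
\Big\{ \|Y - f_\beta\|_n^2 + \lambda \|\beta\|_1 \Big\}
\quad\Longrightarrow\quad
\hat\beta_\lambda \in \operatorname*{arg\,min}_{\|\beta\|_1 \le \hat R} \|Y - f_\beta\|_n^2,
\qquad \hat R := \|\hat\beta_\lambda\|_1 ,
$$

since any $\beta$ with $\|\beta\|_1 \le \hat R$ satisfies $\|Y - f_\beta\|_n^2 \ge \|Y - f_{\hat\beta_\lambda}\|_n^2 + \lambda(\hat R - \|\beta\|_1) \ge \|Y - f_{\hat\beta_\lambda}\|_n^2$. The Lasso solution is thus a least-squares estimator constrained to an ℓ1-ball, so ℓ1-regularization can be analyzed as a selection among the nested collection of models $\{\, \{ f_\beta : \|\beta\|_1 \le R \},\ R > 0 \,\}$, to which a general Gaussian model selection theorem applies.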

Citation

Pascal Massart, Caroline Meynet. "The Lasso as an ℓ1-ball model selection procedure." Electron. J. Statist. 5: 669-687, 2011. https://doi.org/10.1214/11-EJS623

Information

Published: 2011
First available in Project Euclid: 25 July 2011

zbMATH: 1274.62468
MathSciNet: MR2820635
Digital Object Identifier: 10.1214/11-EJS623

Keywords: generalized linear Gaussian model, ℓ_1-balls, ℓ_1-oracle inequalities, Lasso, model selection by penalization

Rights: Copyright © 2011 The Institute of Mathematical Statistics and the Bernoulli Society
