June 2021
On cross-validated Lasso in high dimensions
Denis Chetverikov, Zhipeng Liao, Victor Chernozhukov
Ann. Statist. 49(3): 1300-1317 (June 2021). DOI: 10.1214/20-AOS2000

Abstract

In this paper, we derive nonasymptotic error bounds for the Lasso estimator when the penalty parameter for the estimator is chosen using K-fold cross-validation. Our bounds imply that the cross-validated Lasso estimator has nearly optimal rates of convergence in the prediction, L2, and L1 norms. For example, we show that in the model with the Gaussian noise and under fairly general assumptions on the candidate set of values of the penalty parameter, the estimation error of the cross-validated Lasso estimator converges to zero in the prediction norm with the $\sqrt{s\log p/n}\times\sqrt{\log(pn)}$ rate, where n is the sample size of available data, p is the number of covariates and s is the number of nonzero coefficients in the model. Thus, the cross-validated Lasso estimator achieves the fastest possible rate of convergence in the prediction norm up to a small logarithmic factor $\sqrt{\log(pn)}$, and similar conclusions apply for the convergence rate both in L2 and in L1 norms. Importantly, our results cover the case when p is (potentially much) larger than n and also allow for the case of non-Gaussian noise. Our paper therefore serves as a justification for the widely spread practice of using cross-validation as a method to choose the penalty parameter for the Lasso estimator.
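
For readers unfamiliar with the procedure the abstract analyzes, the following is a minimal sketch of choosing the Lasso penalty by K-fold cross-validation and then refitting on the full sample. It is an illustration only, not the authors' code: the objective normalization (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1, the coordinate-descent solver, the function names (soft_threshold, lasso_cd, cv_lasso), the synthetic data, and the geometric grid of candidate penalties are all assumptions made for this example.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Approximately minimize (1/(2n))*||y - Xb||^2 + lam*||b||_1
    by cyclic coordinate descent (assumed normalization)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def cv_lasso(X, y, lambdas, K=3, seed=0):
    """Pick the penalty in `lambdas` with the smallest K-fold out-of-sample
    squared error, then refit on the full sample with that penalty."""
    n = len(X)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::K] for k in range(K)]  # K roughly equal-sized folds
    cv_err = []
    for lam in lambdas:
        err = 0.0
        for fold in folds:
            held = set(fold)
            train = [i for i in idx if i not in held]
            beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
            # Squared prediction error on the held-out fold.
            err += sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
                       for i in fold)
        cv_err.append(err / n)
    best = min(range(len(lambdas)), key=cv_err.__getitem__)
    return lambdas[best], lasso_cd(X, y, lambdas[best]), cv_err

# Synthetic sparse design with p > n (all values are illustrative assumptions).
rng = random.Random(1)
n, p = 30, 40
beta_true = [2.0, -1.5, 1.0] + [0.0] * (p - 3)  # s = 3 nonzero coefficients
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [sum(b * x for b, x in zip(beta_true, row)) + rng.gauss(0.0, 0.5) for row in X]

lambdas = [0.5 * 0.5 ** k for k in range(5)]  # geometric candidate grid
lam_hat, beta_hat, cv_err = cv_lasso(X, y, lambdas)
print("selected penalty:", lam_hat)
print("nonzero coefficients:", sum(1 for b in beta_hat if b != 0.0))
```

The sketch uses pure Python to stay dependency-free; in practice one would typically rely on an established implementation of cross-validated Lasso rather than this toy solver.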

Funding Statement

Chetverikov’s work was partially funded by NSF Grant SES–1628889. Liao’s work was partially funded by NSF Grant SES–1628889.

Acknowledgments

We thank Mehmet Caner, Matias Cattaneo, Yanqin Fan, Sara van de Geer, Jerry Hausman, James Heckman, Roger Koenker, Andzhey Koziuk, Miles Lopes, Jinchi Lv, Rosa Matzkin, Anna Mikusheva, Whitney Newey, Jesper Sorensen, Vladimir Spokoiny, Larry Wasserman and seminar participants in many places for helpful comments.

Citation

Denis Chetverikov, Zhipeng Liao, Victor Chernozhukov. "On cross-validated Lasso in high dimensions." Ann. Statist. 49 (3) 1300-1317, June 2021. https://doi.org/10.1214/20-AOS2000

Information

Received: 1 February 2019; Revised: 1 July 2020; Published: June 2021
First available in Project Euclid: 9 August 2021

MathSciNet: MR4298865
zbMATH: 1475.62209
Digital Object Identifier: 10.1214/20-AOS2000

Subjects:
Primary: 62J07

Keywords: cross-validation, high-dimensional models, Lasso, nonasymptotic bounds, sparsity

Rights: Copyright © 2021 Institute of Mathematical Statistics
