The Lasso with general Gaussian designs with applications to hypothesis testing

Michael Celentano; Andrea Montanari; Yuting Wei

doi:10.1214/23-AOS2327

Abstract

The Lasso is a method for high-dimensional regression that is now commonly used when the number of covariates p is of the same order as, or larger than, the number of observations n. Classical asymptotic normality theory does not apply to this model for two fundamental reasons: (1) the regularized risk is nonsmooth; (2) the distance between the estimator $\hat{θ}$ and the true parameter vector $θ^{*}$ cannot be neglected. As a consequence, standard perturbative arguments that are the traditional basis for asymptotic normality fail.

On the other hand, the Lasso estimator can be precisely characterized in the regime in which both n and p are large and $n / p$ is of order one. This characterization was first obtained in the case of Gaussian designs with i.i.d. covariates; here we generalize it to Gaussian correlated designs with non-singular covariance structure. The characterization is expressed in terms of a simpler “fixed-design” model. We establish nonasymptotic bounds on the distance between the distribution of various quantities in the two models, which hold uniformly over signals $θ^{*}$ in a suitable sparsity class and over values of the regularization parameter.

As an application, we study the distribution of the debiased Lasso and show that a degrees-of-freedom correction is necessary for computing valid confidence intervals.
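
The role of the degrees-of-freedom correction can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the rows of the design are drawn as N(0, Σ) with Σ known, the Lasso objective is normalized as (1/(2n))‖y − Xθ‖² + λ‖θ‖₁, the solver is a plain proximal-gradient loop, and the helper names `lasso_ista` and `debiased_lasso` are hypothetical. Under these conventions, a commonly used form of the correction divides the debiasing term by n − ‖θ̂‖₀ instead of n; the paper's own normalization may differ.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=2000):
    """Approximately minimize (1/(2n))||y - X theta||^2 + lam * ||theta||_1 (ISTA)."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the Lipschitz constant of the gradient.
    step = n / np.linalg.norm(X, 2) ** 2
    theta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / n
        z = theta - step * grad
        # Soft thresholding: proximal operator of step * lam * ||.||_1.
        theta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return theta

def debiased_lasso(X, y, theta_hat, Sigma, dof_correction=True):
    """Debias theta_hat, assuming rows of X are N(0, Sigma) with Sigma known.

    With dof_correction=True the debiasing term is divided by n - ||theta_hat||_0;
    otherwise it is divided by n.
    """
    n = X.shape[0]
    k = np.count_nonzero(theta_hat)  # number of selected covariates
    denom = n - k if dof_correction else n
    return theta_hat + np.linalg.solve(Sigma, X.T @ (y - X @ theta_hat)) / denom

# Synthetic data (all values below are illustrative choices).
rng = np.random.default_rng(0)
n, p = 200, 100
idx = np.arange(p)
Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])  # AR(1)-type covariance
L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((n, p)) @ L.T               # rows ~ N(0, Sigma)
theta_star = np.zeros(p)
theta_star[:5] = 2.0                                # sparse signal
y = X @ theta_star + rng.standard_normal(n)

theta_hat = lasso_ista(X, y, lam=0.2)
theta_d = debiased_lasso(X, y, theta_hat, Sigma, dof_correction=True)
theta_d0 = debiased_lasso(X, y, theta_hat, Sigma, dof_correction=False)
```

In this sketch the two debiased estimators differ only by the factor n / (n − ‖θ̂‖₀) applied to the debiasing term, which is negligible when the Lasso selects few covariates relative to n and becomes substantial when the number of selected covariates is comparable to n.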

Funding Statement

The first author was partially supported by NSF Grants CCF-1714305, IIS-1741162 and ONR Grant N00014-18-1-2729.
The second author was partially supported by the National Science Foundation Graduate Research Fellowship Grant DGE-1656518.
The third author was partially supported by NSF Grants DMS-2015447/2147546, CAREER award DMS-2143215 and the Google Research Scholar Award.
We thank the anonymous reviewers for their valuable reviews.

Citation

Michael Celentano, Andrea Montanari, Yuting Wei. "The Lasso with general Gaussian designs with applications to hypothesis testing." Ann. Statist. 51 (5) 2194-2220, October 2023. https://doi.org/10.1214/23-AOS2327

Information

Received: 1 July 2022; Revised: 1 June 2023; Published: October 2023

First available in Project Euclid: 14 December 2023

Digital Object Identifier: 10.1214/23-AOS2327

Subjects:

Primary: 62E17, 62J07

Secondary: 62F05, 62F12

Keywords: convex Gaussian min–max theorem, debiased Lasso, exact asymptotics, Gaussian designs, Gaussian width, Lasso
