Universality of regularized regression estimators in high dimensions
Qiyang Han, Yandi Shen
Ann. Statist. 51(4): 1799-1823 (August 2023). DOI: 10.1214/23-AOS2309

Abstract

The Convex Gaussian Min–Max Theorem (CGMT) has emerged as a prominent theoretical tool for analyzing the precise stochastic behavior of various statistical estimators in the so-called high-dimensional proportional regime, where the sample size and the signal dimension are of the same order. However, a well-recognized limitation of the existing CGMT machinery is its stringent requirement that the design matrix be exactly Gaussian, which renders the resulting precise high-dimensional asymptotics largely a Gaussian-specific theory in various important statistical models.

This paper provides a structural universality framework for a broad class of regularized regression estimators that is particularly compatible with the CGMT machinery. Here, universality means that if a “structure” is satisfied by the regression estimator $\hat{\mu}_G$ for a standard Gaussian design $G$, then it will also be satisfied by $\hat{\mu}_A$ for a general non-Gaussian design $A$ with independent entries. In particular, we show that with a good enough bound for the regression estimator $\hat{\mu}_A$, any “structural property” that can be detected via the CGMT for $\hat{\mu}_G$ also holds for $\hat{\mu}_A$ under a general design $A$ with independent entries.

As a proof of concept, we demonstrate our new universality framework in three key examples of regularized regression estimators: the Ridge, Lasso and regularized robust regression estimators, where new universality properties of risk asymptotics and/or distributions of regression estimators and other related quantities are proved. As a major statistical implication of the Lasso universality results, we validate inference procedures using the degrees-of-freedom adjusted debiased Lasso under general design and error distributions. We also provide a counterexample, showing that universality properties for regularized regression estimators do not extend to general isotropic designs.
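As an informal numerical illustration of what universality means for the Ridge example, the sketch below compares the empirical estimation risk of a ridge estimator under a standard Gaussian design and under a Rademacher (random sign) design with independent entries. It is not taken from the paper: the function names, problem sizes, scaling of the design by 1/sqrt(m), penalty level and noise level are all assumptions chosen for illustration, and a simulation of this kind only suggests, and does not prove, the asymptotic statement.

```python
import numpy as np


def ridge_risk(design, m=300, n=200, lam=1.0, sigma=1.0, seed=0):
    """Per-coordinate squared error of ridge regression for one random draw.

    All parameter choices here are illustrative assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    if design == "gaussian":
        A = rng.standard_normal((m, n))
    elif design == "rademacher":
        # Independent +/-1 entries: mean 0, variance 1, but not Gaussian.
        A = rng.choice([-1.0, 1.0], size=(m, n))
    else:
        raise ValueError(f"unknown design: {design}")
    A = A / np.sqrt(m)  # entries have variance 1/m (assumed normalization)

    mu0 = rng.standard_normal(n)                    # true signal
    y = A @ mu0 + sigma * rng.standard_normal(m)    # linear model with noise

    # Ridge estimator: argmin_mu ||y - A mu||^2 + lam * ||mu||^2
    mu_hat = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)
    return float(np.sum((mu_hat - mu0) ** 2) / n)


def mean_risk(design, reps=20, **kwargs):
    """Average the per-coordinate risk over independent replications."""
    return float(np.mean([ridge_risk(design, seed=s, **kwargs) for s in range(reps)]))


if __name__ == "__main__":
    print("Gaussian design  :", mean_risk("gaussian"))
    print("Rademacher design:", mean_risk("rademacher"))
```

With m and n of the same order, the two averaged risks should come out close to each other, which is the kind of agreement the universality results make rigorous. The counterexample mentioned above indicates that such agreement should not be expected for arbitrary isotropic designs whose entries are not independent.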

The proof of our universality results relies on new comparison inequalities for the optimum of a broad class of cost functions and Gordon’s max–min (or min–max) costs, over arbitrary structure sets subject to constraints. These results may be of independent interest and broader applicability.

Funding Statement

The research of Q. Han is partially supported by NSF Grants DMS-1916221 and DMS-2143468.

Acknowledgments

The authors are indebted to Cun-Hui Zhang for a number of stimulating discussions during various stages of this research. The authors also thank three referees, an Associate Editor and the Editor for helpful comments and suggestions that significantly improved the quality of the paper.

Citation

Qiyang Han, Yandi Shen. "Universality of regularized regression estimators in high dimensions." Ann. Statist. 51 (4) 1799-1823, August 2023. https://doi.org/10.1214/23-AOS2309

Information

Received: 1 June 2022; Revised: 1 March 2023; Published: August 2023
First available in Project Euclid: 19 October 2023

Digital Object Identifier: 10.1214/23-AOS2309

Subjects:
Primary: 60F17
Secondary: 62E17

Keywords: Gaussian comparison inequalities, high-dimensional asymptotics, Lasso, Lindeberg’s principle, random matrix theory, Ridge regression, robust regression, universality

Rights: Copyright © 2023 Institute of Mathematical Statistics

JOURNAL ARTICLE
25 PAGES

