Abstract
The Convex Gaussian Min–Max Theorem (CGMT) has emerged as a prominent theoretical tool for analyzing the precise stochastic behavior of various statistical estimators in the so-called high-dimensional proportional regime, where the sample size and the signal dimension are of the same order. However, a well-recognized limitation of the existing CGMT machinery is its stringent requirement that the design matrix be exactly Gaussian, which renders the resulting precise high-dimensional asymptotics largely a Gaussian-specific theory in various important statistical models.
This paper provides a structural universality framework for a broad class of regularized regression estimators that is particularly compatible with the CGMT machinery. Here, universality means that if a “structure” is satisfied by the regression estimator $\hat{\mu}_G$ for a standard Gaussian design G, then it will also be satisfied by $\hat{\mu}_A$ for a general non-Gaussian design A with independent entries. In particular, we show that with a good enough $\ell_\infty$ bound for the regression estimator $\hat{\mu}_A$, any “structural property” that can be detected via the CGMT for $\hat{\mu}_G$ also holds for $\hat{\mu}_A$ under a general design A with independent entries.
As a proof of concept, we demonstrate our new universality framework in three key examples of regularized regression estimators: the Ridge, Lasso and regularized robust regression estimators, where new universality properties of risk asymptotics and/or distributions of regression estimators and other related quantities are proved. As a major statistical implication of the Lasso universality results, we validate inference procedures using the degrees-of-freedom adjusted debiased Lasso under general design and error distributions. We also provide a counterexample, showing that universality properties for regularized regression estimators do not extend to general isotropic designs.
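As an informal numerical illustration of what universality means for the Ridge estimator, the following Python sketch compares the average squared-error risk under a standard Gaussian design with that under a Rademacher (random sign) design with independent entries. It is not taken from the paper: the dimensions, the $1/\sqrt{m}$ column scaling, the penalty level `lam`, the noise level `sigma`, the choice of signal, and the function names are all assumptions made for this example.

```python
import numpy as np


def ridge_estimate(A, y, lam):
    """Closed-form minimizer of ||y - A mu||^2 / 2 + lam * ||mu||^2 / 2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)


def average_ridge_risk(design, m=300, n=200, lam=1.0, sigma=1.0, reps=20, seed=0):
    """Monte Carlo average of ||mu_hat - mu0||^2 / n for a given design type.

    All default values are illustrative assumptions, not settings from the paper.
    """
    # The same fixed signal is used for every design so the risks are comparable.
    mu0 = np.random.default_rng(123).standard_normal(n)
    rng = np.random.default_rng(seed)
    risks = []
    for _ in range(reps):
        if design == "gaussian":
            Z = rng.standard_normal((m, n))
        elif design == "rademacher":
            Z = rng.choice([-1.0, 1.0], size=(m, n))
        else:
            raise ValueError(f"unknown design: {design}")
        A = Z / np.sqrt(m)  # independent entries with mean 0 and variance 1/m
        y = A @ mu0 + sigma * rng.standard_normal(m)
        mu_hat = ridge_estimate(A, y, lam)
        risks.append(np.sum((mu_hat - mu0) ** 2) / n)
    return float(np.mean(risks))


risk_gaussian = average_ridge_risk("gaussian", seed=0)
risk_rademacher = average_ridge_risk("rademacher", seed=1)
print(f"Gaussian design risk:   {risk_gaussian:.4f}")
print(f"Rademacher design risk: {risk_rademacher:.4f}")
```

With the sample size and dimension of the same order, the two printed risks should be close to each other, which is the kind of agreement the universality results describe. A simulation of this sort only suggests the phenomenon at one parameter setting and does not substitute for the theory.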
The proof of our universality results relies on new comparison inequalities for the optimum of a broad class of cost functions and Gordon’s max–min (or min–max) costs, over arbitrary structure sets subject to $\ell_\infty$ constraints. These results may be of independent interest and broader applicability.
Funding Statement
The research of Q. Han is partially supported by NSF Grants DMS-1916221 and DMS-2143468.
Acknowledgments
The authors are indebted to Cun-Hui Zhang for a number of stimulating discussions during various stages of this research. The authors also thank three referees, an Associate Editor and the Editor for helpful comments and suggestions that significantly improved the quality of the paper.
Citation
Qiyang Han. Yandi Shen. "Universality of regularized regression estimators in high dimensions." Ann. Statist. 51 (4) 1799 - 1823, August 2023. https://doi.org/10.1214/23-AOS2309