Open Access
Benign overfitting of non-sparse high-dimensional linear regression with correlated noise
Toshiki Tsuda, Masaaki Imaizumi
Electron. J. Statist. 18(2): 4119-4197 (2024). DOI: 10.1214/24-EJS2297

Abstract

We investigate the high-dimensional linear regression problem in the presence of noise that is correlated with Gaussian covariates. This type of correlation, known as endogeneity in regression models, often results from unobserved variables and other factors. It poses a significant challenge in causal inference and econometrics. In cases where covariates are high-dimensional, it is common to assume sparsity in the true parameters and to estimate them using regularization techniques, even with endogeneity. However, when sparsity is not applicable, controlling both endogeneity and high dimensionality simultaneously has not been well understood. This study demonstrates that an estimator, even without regularization, can achieve consistency, or benign overfitting, under certain assumptions about the covariance matrix. Specifically, our results indicate that the error of this estimator converges to zero when the covariance matrices of the correlated noise and the instrumental variables meet specific conditions related to their eigenvalues. We explore several extensions that relax these conditions and conduct experiments to validate our theoretical findings. As a technical contribution, we employ the convex Gaussian minimax theorem (CGMT) in our dual problem and expand upon CGMT itself.
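
The setting described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not name the estimator, so the code assumes the minimum-norm interpolator (pseudoinverse solution) as one example of an estimator without regularization. The isotropic covariance, the dense parameter vector, the correlation level `rho`, and all function names are hypothetical choices made for illustration. The sketch does not implement the eigenvalue conditions or the instrumental variables mentioned in the abstract, so it should not be expected to exhibit the convergence the paper proves.

```python
import numpy as np


def simulate_endogenous_data(n, p, rho, rng):
    """Draw Gaussian covariates and noise that is correlated with the first covariate.

    All modelling choices here (isotropic covariates, dense theta, rho) are
    illustrative assumptions, not taken from the paper.
    """
    X = rng.standard_normal((n, p))
    # Non-sparse true parameter: every coordinate carries equal weight.
    theta = np.ones(p) / np.sqrt(p)
    # Endogenous noise with unit variance and Cov(X[:, 0], xi) = rho.
    independent_part = rng.standard_normal(n)
    xi = rho * X[:, 0] + np.sqrt(1.0 - rho**2) * independent_part
    y = X @ theta + xi
    return X, y, theta


def min_norm_interpolator(X, y):
    """Return the minimum Euclidean norm solution of X @ theta = y."""
    return np.linalg.pinv(X) @ y


rng = np.random.default_rng(0)
X, y, theta = simulate_endogenous_data(n=50, p=2000, rho=0.5, rng=rng)
theta_hat = min_norm_interpolator(X, y)

# With p > n the estimator fits the training data exactly (it "overfits").
train_residual = np.max(np.abs(X @ theta_hat - y))
# Distance to the true parameter; whether it is small depends on the covariance structure.
estimation_error = np.linalg.norm(theta_hat - theta)

print(f"max training residual: {train_residual:.2e}")
print(f"estimation error:      {estimation_error:.3f}")
```

The training residual is zero up to floating-point error because the estimator interpolates the data. The estimation error printed here depends entirely on the assumed covariance and is not a check of the paper's result.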

Funding Statement

T. Tsuda was supported by JSPS Grant-in-Aid for JSPS Research Fellows (23KJ0713). M. Imaizumi was supported by JSPS KAKENHI (21K11780), JST CREST (JPMJCR21D2), and JST FOREST (JPMJFR216I).

Citation

Toshiki Tsuda, Masaaki Imaizumi. "Benign overfitting of non-sparse high-dimensional linear regression with correlated noise." Electron. J. Statist. 18 (2) 4119-4197, 2024. https://doi.org/10.1214/24-EJS2297

Information

Received: 1 April 2023; Published: 2024
First available in Project Euclid: 11 November 2024

Digital Object Identifier: 10.1214/24-EJS2297

Subjects:
Primary: 62J05

Keywords: convex Gaussian minimax theorem, endogeneity, high-dimension, linear regression, non-sparsity

Vol.18 • No. 2 • 2024