Open Access
2022 Dimension independent excess risk by stochastic gradient descent
Xi Chen, Qiang Liu, Xin T. Tong
Electron. J. Statist. 16(2): 4547-4603 (2022). DOI: 10.1214/22-EJS2055

Abstract

A classical tenet of statistics is that large models are prone to overfitting and that model selection procedures are necessary for high-dimensional data. However, many overparameterized models, such as neural networks, perform very well in practice, although they are often trained with simple online methods and regularization. This empirical success, often referred to as benign overfitting, motivates a fresh look at the statistical generalization theory for online optimization. In particular, we present a general theory on the excess risk of stochastic gradient descent (SGD) solutions for both convex and locally non-convex loss functions. We further discuss data and model conditions that lead to a “low effective dimension”. Under these conditions, we show that the excess risk either does not depend on the ambient dimension p or depends on p only through a poly-logarithmic factor. We also demonstrate that in several widely used statistical models, the “low effective dimension” arises naturally in overparameterized settings. The statistical applications studied include convex models, such as linear regression and logistic regression, and non-convex models, such as M-estimators and two-layer neural networks.

Funding Statement

Xi Chen thanks NSF for support via grant IIS-1845444. Qiang Liu was a postdoctoral Research Fellow at the Department of Mathematics, National University of Singapore, when the manuscript was submitted, with financial support from Singapore MOE via grant R-146-000-258-114. He is now an Assistant Professor at the School of Statistics and Management, Shanghai University of Finance and Economics, and his work is supported by the Fundamental Research Funds for the Central Universities, Shanghai University of Finance and Economics (No. 2021110482, No. 2022110007). Xin T. Tong thanks Singapore MOE for support via grant R-146-000-292-114.

Acknowledgments

The authors thank the editor and the anonymous reviewers for their suggestions.

Citation

Xi Chen, Qiang Liu, Xin T. Tong. "Dimension independent excess risk by stochastic gradient descent." Electron. J. Statist. 16 (2) 4547-4603, 2022. https://doi.org/10.1214/22-EJS2055

Information

Received: 1 March 2021; Published: 2022
First available in Project Euclid: 19 September 2022

MathSciNet: MR4489235
zbMATH: 07603093
Digital Object Identifier: 10.1214/22-EJS2055

Subjects:
Primary: 49N60
Secondary: 49M25, 49M37

Keywords: Dimension independent, excess risk, regularization, stochastic gradient descent, weak non-convexity

Vol.16 • No. 2 • 2022