August 2022
Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices
Jaouad Mourtada
Ann. Statist. 50(4): 2157-2178 (August 2022). DOI: 10.1214/22-AOS2181

Abstract

We consider random-design linear prediction and related questions on the lower tail of random matrices. It is known that, under boundedness constraints, the minimax risk is of order d/n in dimension d with n samples. Here, we study the minimax expected excess risk over the full linear class, depending on the distribution of covariates. First, the least squares estimator is exactly minimax optimal in the well-specified case, for every distribution of covariates. We express the minimax risk in terms of the distribution of statistical leverage scores of individual samples, and deduce a minimax lower bound of d/(n − d + 1) for any covariate distribution, nearly matching the risk for Gaussian design. We then obtain sharp nonasymptotic upper bounds for covariates that satisfy a “small ball”-type regularity condition in both well-specified and misspecified cases.

Our main technical contribution is the study of the lower tail of the smallest singular value of empirical covariance matrices at small values. We establish a lower bound on this lower tail, valid for any distribution in dimension d ≥ 2, together with a matching upper bound under a necessary regularity condition. Our proof relies on the PAC-Bayes technique for controlling empirical processes, and extends an analysis of Oliveira devoted to a different part of the lower tail.
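As a rough numerical illustration of the comparison made in the abstract, the sketch below estimates by Monte Carlo the expected excess risk of least squares under standard Gaussian design and sets it against the distribution-free lower bound d/(n − d + 1). It is not taken from the paper: the function names, the unit noise variance, the identity covariance, and the use of the classical inverse-Wishart value d/(n − d − 1) for the Gaussian risk are assumptions made for this example.

```python
import numpy as np

def gaussian_ols_excess_risk(n, d, trials=2000, seed=0):
    """Monte Carlo estimate of E[tr((X^T X)^{-1})] for X with i.i.d. N(0, 1) entries.

    Assumption: well-specified model, unit noise variance and identity covariance,
    in which case this trace is the conditional excess risk of least squares.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(trials):
        X = rng.standard_normal((n, d))
        total += np.trace(np.linalg.inv(X.T @ X))
    return total / trials

def lower_bound(n, d):
    # Lower bound d / (n - d + 1) quoted in the abstract (noise variance 1).
    return d / (n - d + 1)

def gaussian_risk(n, d):
    # Classical inverse-Wishart mean, valid for n > d + 1 (an assumption of this sketch).
    return d / (n - d - 1)

if __name__ == "__main__":
    n, d = 50, 5
    print("Monte Carlo estimate:", gaussian_ols_excess_risk(n, d))
    print("Gaussian closed form:", gaussian_risk(n, d))
    print("Lower bound         :", lower_bound(n, d))
```

For moderate n and d the two closed-form values differ only through the denominator (n − d − 1 versus n − d + 1), which is the sense in which the bound nearly matches the Gaussian case.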

Funding Statement

Part of this work was carried out at Centre de Mathématiques Appliquées, École polytechnique, France, and supported by a public grant as part of the Investissement d’avenir project, reference ANR-11-LABX-0056-LMH, LabEx LMH. Part of this work was carried out at the Machine Learning Genoa center, Università di Genova, Italy.

Acknowledgments

The author would like to thank two anonymous referees and an associate editor for very helpful comments that improved the quality of this paper.

Citation


Jaouad Mourtada. "Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices." Ann. Statist. 50 (4) 2157 - 2178, August 2022. https://doi.org/10.1214/22-AOS2181

Information

Received: 1 May 2020; Revised: 1 October 2021; Published: August 2022
First available in Project Euclid: 25 August 2022

MathSciNet: MR4474486
zbMATH: 1500.62002
Digital Object Identifier: 10.1214/22-AOS2181

Subjects:
Primary: 62J05
Secondary: 60B20, 62C20

Keywords: anticoncentration, covariance matrices, decision theory, least squares, lower bounds, statistical learning theory

Rights: Copyright © 2022 Institute of Mathematical Statistics
