Open Access
Optimal prediction for sparse linear models? Lower bounds for coordinate-separable M-estimators
Yuchen Zhang, Martin J. Wainwright, Michael I. Jordan
Electron. J. Statist. 11(1): 752-799 (2017). DOI: 10.1214/17-EJS1233

Abstract

For the problem of high-dimensional sparse linear regression, it is known that an $\ell_{0}$-based estimator can achieve the $1/n$ “fast” rate for prediction error without any conditions on the design matrix, whereas in the absence of restrictive conditions on the design matrix, popular polynomial-time methods only guarantee the $1/\sqrt{n}$ “slow” rate. In this paper, we show that the slow rate is intrinsic to a broad class of M-estimators. In particular, for estimators based on minimizing a least-squares cost function together with a (possibly nonconvex) coordinate-wise separable regularizer, there is always a “bad” local optimum such that the associated prediction error is lower bounded by a constant multiple of $1/\sqrt{n}$. For convex regularizers, this lower bound applies to all global optima. The theory applies to many popular estimators, including convex $\ell_{1}$-based methods as well as M-estimators based on nonconvex regularizers such as the SCAD penalty and the MCP. In addition, we show that bad local optima are very common, in that a broad class of local minimization algorithms with random initialization typically converge to a bad solution.
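The class of estimators studied here minimizes a least-squares cost plus a coordinate-wise separable penalty; the Lasso ($\ell_{1}$ penalty) is the canonical convex member. As a minimal sketch (not the paper's construction), the following NumPy code implements cyclic coordinate descent for the Lasso on a synthetic sparse instance and evaluates the in-sample prediction error $\|X(\hat{\beta}-\beta^{*})\|_{2}^{2}/n$; all variable names and parameter choices are illustrative assumptions.

```python
import numpy as np

def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    The l1 penalty is coordinate-wise separable, so each coordinate update
    is an exact one-dimensional minimization (soft-thresholding).
    """
    n, d = X.shape
    beta = np.zeros(d)
    col_norms = (X ** 2).sum(axis=0) / n
    for _ in range(n_iters):
        for j in range(d):
            # Partial residual with coordinate j removed from the fit.
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r / n
            # Soft-thresholding: proximal step for the l1 penalty.
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_norms[j]
    return beta

# Illustrative sparse regression instance (well-conditioned Gaussian design).
rng = np.random.default_rng(0)
n, d, k = 200, 50, 3
X = rng.standard_normal((n, d))
beta_star = np.zeros(d)
beta_star[:k] = 1.0                      # k-sparse ground truth
y = X @ beta_star + 0.5 * rng.standard_normal(n)

beta_hat = lasso_coordinate_descent(X, y, lam=0.1)
pred_err = np.mean((X @ (beta_hat - beta_star)) ** 2)
```

On a benign random design like this one the Lasso predicts well; the paper's lower bound concerns carefully constructed design matrices on which every coordinate-separable M-estimator has a local (or, if convex, global) optimum with prediction error of order $1/\sqrt{n}$.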

Citation


Yuchen Zhang. Martin J. Wainwright. Michael I. Jordan. "Optimal prediction for sparse linear models? Lower bounds for coordinate-separable M-estimators." Electron. J. Statist. 11 (1) 752 - 799, 2017. https://doi.org/10.1214/17-EJS1233

Information

Received: 1 November 2015; Published: 2017
First available in Project Euclid: 11 March 2017

zbMATH: 1362.62053
MathSciNet: MR3622646
Digital Object Identifier: 10.1214/17-EJS1233

Subjects:
Primary: 62F12
Secondary: 62J05

Keywords: computationally-constrained minimax theory, high-dimensional statistics, nonconvex optimization, sparse linear regression
