Open Access
Finite-sample analysis of $M$-estimators using self-concordance
Dmitrii M. Ostrovskii, Francis Bach
Electron. J. Statist. 15(1): 326-391 (2021). DOI: 10.1214/20-EJS1780

Abstract

The classical asymptotic theory for parametric $M$-estimators guarantees that, in the limit of infinite sample size, the excess risk has a chi-square type distribution, even in the misspecified case. We demonstrate how self-concordance of the loss allows one to characterize the critical sample size sufficient to guarantee a chi-square type in-probability bound for the excess risk. Specifically, we consider two classes of losses: (i) self-concordant losses in the classical sense of Nesterov and Nemirovski, i.e., those whose third derivative is uniformly bounded by the $3/2$ power of the second derivative; (ii) pseudo self-concordant losses, for which the power is removed. These classes contain losses corresponding to several generalized linear models, including the logistic loss and pseudo-Huber losses.
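
As an illustration of the two conditions, the following minimal sketch numerically checks them for the logistic loss. It assumes the parameterization $\ell (\eta )=\log (1+e^{-\eta })$, which is not spelled out in the abstract; the function names `logistic_derivs` and `max_ratios` and the grid are hypothetical choices made for this example only. Under that assumption, the ratio $|\ell '''|/\ell ''$ stays bounded (the pseudo self-concordant condition), while $|\ell '''|/(\ell '')^{3/2}$ grows without bound as $|\eta |$ increases.

```python
import math


def logistic_derivs(eta):
    """Second and third derivatives of l(eta) = log(1 + exp(-eta)).

    With s = sigmoid(eta): l'' = s(1 - s) and l''' = s(1 - s)(1 - 2s).
    """
    s = 1.0 / (1.0 + math.exp(-eta))
    d2 = s * (1.0 - s)
    d3 = d2 * (1.0 - 2.0 * s)
    return d2, d3


def max_ratios(grid):
    """Largest |l'''| / l'' and |l'''| / (l'')^(3/2) over the grid."""
    pseudo = 0.0      # ratio in the pseudo self-concordant condition
    classical = 0.0   # ratio in the classical (3/2-power) condition
    for eta in grid:
        d2, d3 = logistic_derivs(eta)
        pseudo = max(pseudo, abs(d3) / d2)
        classical = max(classical, abs(d3) / d2 ** 1.5)
    return pseudo, classical


# Illustrative grid on [-20, 20]; the range is an arbitrary choice.
grid = [k / 10.0 for k in range(-200, 201)]
pseudo_ratio, classical_ratio = max_ratios(grid)
print("max |l'''| / l''        :", pseudo_ratio)
print("max |l'''| / (l'')^(3/2):", classical_ratio)
```

Widening the grid leaves the first ratio below one and makes the second ratio larger, which is consistent with the logistic loss falling under class (ii) rather than class (i) in this parameterization.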

Our basic result, under minimal assumptions, bounds the critical sample size by $O(d\cdot d_{\text{eff}})$, where $d$ is the parameter dimension and $d_{\text{eff}}$ is the effective dimension that accounts for model misspecification. In contrast to the existing results, we only impose local assumptions that concern the population risk minimizer $\theta _{*}$. Namely, we assume that the calibrated predictors, i.e., predictors scaled by the square root of the second derivative of the loss, are subgaussian at $\theta _{*}$. In addition, for losses of type (ii) we require boundedness of a certain measure of curvature of the population risk at $\theta _{*}$.

Our improved result bounds the critical sample size from above as \begin{equation*}O(\max \{d_{\text{eff}},d\log d\})\end{equation*} under slightly stronger assumptions. Namely, the local assumptions must hold in the neighborhood of $\theta _{*}$ given by the Dikin ellipsoid of the population risk. Interestingly, we find that, for logistic regression with Gaussian design, there is no actual restriction of conditions: the subgaussian parameter and curvature measure remain near-constant over the Dikin ellipsoid. Finally, we extend some of these results to $\ell _{1}$-penalized estimators in high dimensions.

Citation

Dmitrii M. Ostrovskii, Francis Bach. "Finite-sample analysis of $M$-estimators using self-concordance." Electron. J. Statist. 15 (1) 326-391, 2021. https://doi.org/10.1214/20-EJS1780

Information

Received: 1 March 2020; Published: 2021
First available in Project Euclid: 6 January 2021

Digital Object Identifier: 10.1214/20-EJS1780

Subjects:
Primary: 62F10, 62F12, 62F99
Secondary: 90C90

Keywords: $M$-estimators, empirical risk minimization, fast rates, logistic regression, random design, robustness, self-concordance
