Open Access
February 2020
The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression
Emmanuel J. Candès, Pragya Sur
Ann. Statist. 48(1): 27-42 (February 2020). DOI: 10.1214/18-AOS1789

Abstract

This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp “phase transition.” We introduce an explicit boundary curve $h_{\mathrm{MLE}}$, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes $n$ and number of features $p$ proportioned in such a way that $p/n\rightarrow \kappa $, we show that if the problem is sufficiently high dimensional in the sense that $\kappa >h_{\mathrm{MLE}}$, then the MLE does not exist with probability one. Conversely, if $\kappa <h_{\mathrm{MLE}}$, the MLE asymptotically exists with probability one.
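To make the threshold concrete, here is a minimal sketch for one special case only. It does not implement the paper's general curve $h_{\mathrm{MLE}}$, which the abstract does not spell out. The sketch assumes that the responses are independent of the covariates (all regression coefficients are zero) and that the model has no intercept. It relies on two classical facts: the logistic MLE fails to exist exactly when the two classes can be linearly separated, and Cover's (1965) counting formula gives the probability that $n$ points in general position in $\mathbb{R}^{p}$ with uniformly random labels are separable by a hyperplane through the origin. In this special case the transition sits at $\kappa = 1/2$. The function name `prob_separable` and the grid of ratios are illustrative choices, not taken from the paper.

```python
from math import comb


def prob_separable(n: int, p: int) -> float:
    """Cover's formula: P(separable) = 2^{-(n-1)} * sum_{k=0}^{p-1} C(n-1, k).

    Assumptions (not from the paper's general setting): n points in general
    position in R^p, labels uniform and independent of the points, separating
    hyperplane through the origin (no intercept).
    """
    if n < 1 or p < 0:
        raise ValueError("need n >= 1 and p >= 0")
    # Exact integer arithmetic; the final division yields a float in [0, 1].
    total = sum(comb(n - 1, k) for k in range(p))
    return total / 2 ** (n - 1)


def prob_mle_exists(n: int, p: int) -> float:
    """Under the same assumptions, the MLE exists iff the data are not separable."""
    return 1.0 - prob_separable(n, p)


if __name__ == "__main__":
    n = 2000
    # Sweep kappa = p/n across the null-case threshold 1/2.
    for kappa in (0.40, 0.45, 0.48, 0.50, 0.52, 0.55, 0.60):
        p = round(kappa * n)
        print(f"kappa={kappa:.2f}  P(MLE exists)={prob_mle_exists(n, p):.6f}")
```

Running the sweep for growing `n` shows the existence probability moving toward a step function in $\kappa$ at $1/2$; at $n = 2p$ the formula gives exactly $1/2$ by the symmetry of the binomial coefficients.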

Citation


Emmanuel J. Candès, Pragya Sur. "The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression." Ann. Statist. 48 (1) 27-42, February 2020. https://doi.org/10.1214/18-AOS1789

Information

Received: 1 October 2018; Published: February 2020
First available in Project Euclid: 17 February 2020

zbMATH: 07196528
MathSciNet: MR4065151
Digital Object Identifier: 10.1214/18-AOS1789

Subjects:
Primary: 62E20, 62J12

Keywords: High-dimensional logistic regression, MLE phase transition

Rights: Copyright © 2020 Institute of Mathematical Statistics

Vol. 48 • No. 1 • February 2020