Open Access
Best subset selection via a modern optimization lens
Dimitris Bertsimas, Angela King, Rahul Mazumder
Ann. Statist. 44(2): 813-852 (April 2016). DOI: 10.1214/15-AOS1388

Abstract

In the period 1991–2015, algorithmic advances in Mixed Integer Optimization (MIO) coupled with hardware improvements have resulted in an astonishing 450 billion factor speedup in solving MIO problems. We present a MIO approach for solving the classical best subset selection problem of choosing $k$ out of $p$ features in linear regression given $n$ observations. We develop a discrete extension of modern first-order continuous optimization methods to find high-quality feasible solutions that we use as warm starts to a MIO solver that finds provably optimal solutions. The resulting algorithm (a) provides a solution with a guarantee on its suboptimality even if we terminate the algorithm early, (b) can accommodate side constraints on the coefficients of the linear regression, and (c) extends to finding best subset solutions for the least absolute deviation loss function. Using a wide variety of synthetic and real datasets, we demonstrate that our approach solves problems with $n$ in the 1000s and $p$ in the 100s in minutes to provable optimality, and finds near-optimal solutions for $n$ in the 100s and $p$ in the 1000s in minutes. We also establish via numerical experiments that the MIO approach performs better than Lasso and other popularly used sparse learning procedures, in terms of achieving sparse solutions with good predictive power.
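As a rough illustration of the warm-start idea described in the abstract, the sketch below applies projected gradient descent with hard thresholding to the problem of minimizing $\frac{1}{2}\|y - X\beta\|_2^2$ subject to $\|\beta\|_0 \le k$. It is a generic sketch under stated assumptions, not the authors' exact algorithm: the function names `hard_threshold` and `discrete_first_order`, the step size $1/L$ with $L$ the largest eigenvalue of $X^\top X$, the stopping rule, and the final least-squares polish on the selected support are all choices made here for illustration.

```python
import numpy as np


def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    if k > 0:
        idx = np.argpartition(np.abs(v), -k)[-k:]
        out[idx] = v[idx]
    return out


def discrete_first_order(X, y, k, max_iter=500, tol=1e-8):
    """Heuristic k-sparse least-squares fit (illustrative, assumes 1 <= k <= p).

    Each iteration takes a gradient step on 0.5 * ||y - X b||^2 and then
    projects onto the set of vectors with at most k nonzero entries.
    """
    p = X.shape[1]
    # Lipschitz constant of the gradient: largest eigenvalue of X^T X.
    L = np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(max_iter):
        grad = X.T @ (X @ beta - y)
        new_beta = hard_threshold(beta - grad / L, k)
        converged = np.linalg.norm(new_beta - beta) <= tol
        beta = new_beta
        if converged:
            break
    # Polish (an assumption of this sketch): refit least squares on the support.
    support = np.flatnonzero(beta)
    if support.size:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta = np.zeros(p)
        beta[support] = coef
    return beta


# Small synthetic example: 3 relevant features out of 20.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
true_beta = np.zeros(20)
true_beta[[2, 7, 11]] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(100)
beta_hat = discrete_first_order(X, y, k=3)
```

A feasible solution such as `beta_hat` is the kind of point that could be handed to a MIO solver as an initial incumbent. The solver call is omitted here because it depends on the specific solver and modeling interface in use.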

Citation


Dimitris Bertsimas, Angela King, Rahul Mazumder. "Best subset selection via a modern optimization lens." Ann. Statist. 44 (2) 813-852, April 2016. https://doi.org/10.1214/15-AOS1388

Information

Received: 1 June 2014; Revised: 1 August 2015; Published: April 2016
First available in Project Euclid: 17 March 2016

zbMATH: 1335.62115
MathSciNet: MR3476618
Digital Object Identifier: 10.1214/15-AOS1388

Subjects:
Primary: 62G35, 62J05, 62J07
Secondary: 90C11, 90C26, 90C27

Keywords: $\ell_{0}$-constrained minimization, algorithms, Best subset selection, discrete optimization, global optimization, Lasso, least absolute deviation, mixed integer programming, sparse linear regression

Rights: Copyright © 2016 Institute of Mathematical Statistics

Vol. 44 • No. 2 • April 2016