Grouped variable selection with discrete optimization: Computational and statistical perspectives
February 2023
Hussein Hazimeh, Rahul Mazumder, Peter Radchenko
Ann. Statist. 51(1): 1-32 (February 2023). DOI: 10.1214/21-AOS2155

Abstract

We present a new algorithmic framework for grouped variable selection that is based on discrete mathematical optimization. While there exist several appealing approaches based on convex relaxations and nonconvex heuristics, we focus on optimal solutions for the ℓ0-regularized formulation, a problem that is relatively unexplored due to computational challenges. Our methodology covers both high-dimensional linear regression and nonparametric sparse additive modeling with smooth components. Our algorithmic framework consists of approximate and exact algorithms. The approximate algorithms are based on coordinate descent and local search, with runtimes comparable to popular sparse learning algorithms. Our exact algorithm is based on a standalone branch-and-bound (BnB) framework, which can solve the associated mixed integer programming (MIP) problem to certified optimality. By exploiting the problem structure, our custom BnB algorithm can solve to optimality problem instances with 5 × 10^6 features and 10^3 observations in minutes to hours, over 1000 times larger than what is currently possible using state-of-the-art commercial MIP solvers. We also explore statistical properties of the ℓ0-based estimators. We demonstrate, theoretically and empirically, that our proposed estimators have an edge over popular group-sparse estimators in terms of statistical performance in various regimes. We provide an open source implementation of our proposed framework.
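
To make the coordinate-descent idea mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain least-squares objective of the form 0.5 * ||y - X b||^2 + lam * (number of nonzero groups); the paper's actual objective and algorithms may differ. The function name `group_l0_bcd`, the `groups` argument, and the parameter `lam` are hypothetical names chosen for illustration.

```python
import numpy as np

def group_l0_bcd(X, y, groups, lam, n_iter=50):
    """Cyclic block coordinate descent for the ASSUMED objective
    0.5 * ||y - X b||^2 + lam * (number of nonzero groups).
    `groups` is a list of column-index arrays (hypothetical interface)."""
    beta = np.zeros(X.shape[1])
    resid = y.astype(float).copy()          # residual y - X @ beta
    for _ in range(n_iter):
        changed = False
        for idx in groups:
            Xg = X[:, idx]
            # Partial residual: add back the current contribution of this group.
            r = resid + Xg @ beta[idx]
            # Least-squares fit of the partial residual on this group alone.
            b, *_ = np.linalg.lstsq(Xg, r, rcond=None)
            # Reduction in squared-error loss from using b instead of zero.
            gain = 0.5 * (r @ r - np.sum((r - Xg @ b) ** 2))
            # Keep the group only if the loss reduction exceeds the penalty.
            new = b if gain > lam else np.zeros(len(idx))
            if not np.allclose(new, beta[idx]):
                changed = True
            beta[idx] = new
            resid = r - Xg @ new
        if not changed:                      # stop once a full pass changes nothing
            break
    return beta

# Small synthetic illustration (made-up data): 5 groups of 3 features,
# with only groups 0 and 2 carrying signal.
rng = np.random.default_rng(0)
n, group_size, n_groups = 200, 3, 5
X = rng.standard_normal((n, group_size * n_groups))
groups = [np.arange(g * group_size, (g + 1) * group_size) for g in range(n_groups)]
beta_true = np.zeros(X.shape[1])
beta_true[groups[0]] = 2.0
beta_true[groups[2]] = 2.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat = group_l0_bcd(X, y, groups, lam=100.0)
selected = [g for g, idx in enumerate(groups) if np.any(beta_hat[idx] != 0)]
print("selected groups:", selected)
```

Each block update is an exact minimization over one group with the others held fixed, so the penalized objective never increases. Such a scheme only finds a local solution; the certified-optimal solutions described in the abstract require the branch-and-bound machinery, which this sketch does not attempt.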

Funding Statement

The research was partially supported by the Office of Naval Research (N000141812298, N000142112841, N000142212665) and the National Science Foundation (NSF-IIS-1718258).

Acknowledgments

We would like to thank the Associate Editor and the referees for their thoughtful and constructive comments that helped us improve the paper. Hussein Hazimeh contributed to the research when he was a graduate student at MIT. We thank Shibal Ibrahim for his help with the Boston Housing data set experiment.

Citation

Hussein Hazimeh, Rahul Mazumder, Peter Radchenko. "Grouped variable selection with discrete optimization: Computational and statistical perspectives." Ann. Statist. 51 (1) 1-32, February 2023. https://doi.org/10.1214/21-AOS2155

Information

Received: 1 March 2021; Revised: 1 November 2021; Published: February 2023
First available in Project Euclid: 23 March 2023

MathSciNet: MR4564847
zbMATH: 07684003
Digital Object Identifier: 10.1214/21-AOS2155

Subjects:
Primary: 62G05, 62Jxx, 90C06, 90C11

Keywords: branch and bound algorithms, group variable selection, ℓ0 regularization, mixed integer programming, nonparametric additive models, sparsity

Rights: Copyright © 2023 Institute of Mathematical Statistics

JOURNAL ARTICLE
32 PAGES
