A precise high-dimensional asymptotic theory for boosting and minimum-ℓ1-norm interpolated classifiers
Tengyuan Liang, Pragya Sur
Ann. Statist. 50(3): 1669-1695 (June 2022). DOI: 10.1214/22-AOS2170

Abstract

This paper establishes a precise high-dimensional asymptotic theory for boosting on separable data, taking statistical and computational perspectives. We consider a high-dimensional setting where the number of features (weak learners) p scales with the sample size n, in an overparametrized regime. Under a class of statistical models, we provide an exact analysis of the generalization error of boosting when the algorithm interpolates the training data and maximizes the empirical ℓ1-margin. Further, we explicitly pin down the relation between the boosting test error and the optimal Bayes error, as well as the proportion of active features at interpolation (with zero initialization). In turn, these precise characterizations answer certain questions raised in (Neural Comput. 11 (1999) 1493–1517; Ann. Statist. 26 (1998) 1651–1686) surrounding boosting, under assumed data generating processes. At the heart of our theory lies an in-depth study of the maximum-ℓ1-margin, which can be accurately described by a new system of nonlinear equations; to analyze this margin, we rely on Gaussian comparison techniques and develop a novel uniform deviation argument. Our statistical and computational arguments can handle (1) any finite-rank spiked covariance model for the feature distribution and (2) variants of boosting corresponding to general ℓq-geometry, for q ∈ [1, 2]. As a final component, via the Lindeberg principle, we establish a universality result showcasing that the scaled ℓ1-margin (asymptotically) remains the same, whether the covariates used for boosting arise from a nonlinear random feature model or an appropriately linearized model with matching moments.
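The central object above, the maximum-ℓ1-margin κ = max over ||θ||_1 ≤ 1 of min_i y_i⟨x_i, θ⟩, is a directly computable quantity, and its maximizer rescales to the minimum-ℓ1-norm interpolant: if θ̂ attains margin κ > 0, then θ̂/κ solves min{||θ||_1 : y_i⟨x_i, θ⟩ ≥ 1 for all i}, whose optimal value is 1/κ. The sketch below is a minimal illustration of this computation, not the paper's code: the synthetic Gaussian design, the planted sparse signal theta_star, and the dimensions are our assumptions, chosen only so the data are separable.

```python
import numpy as np
from scipy.optimize import linprog

# Minimal sketch (our illustration, not the paper's code): compute the
# maximum l1-margin
#     kappa = max_{||theta||_1 <= 1} min_i y_i <x_i, theta>
# on synthetic separable data via a linear program. Split theta = u - v
# with u, v >= 0, so that ||theta||_1 = sum(u) + sum(v).

rng = np.random.default_rng(0)
n, p = 50, 200                                   # overparametrized: p > n
theta_star = np.zeros(p)                         # hypothetical sparse signal
theta_star[:5] = 1.0
X = rng.standard_normal((n, p))                  # isotropic Gaussian features
y = np.sign(X @ theta_star)                      # labels: data is separable

# LP variables z = (u, v, t) of dimension 2p + 1; maximize t.
c = np.zeros(2 * p + 1)
c[-1] = -1.0                                     # linprog minimizes c @ z
# Margin constraints: t - y_i x_i^T (u - v) <= 0 for every sample i.
A_margin = np.hstack([-y[:, None] * X, y[:, None] * X, np.ones((n, 1))])
# l1-ball constraint: sum(u) + sum(v) <= 1.
A_l1 = np.hstack([np.ones((1, 2 * p)), np.zeros((1, 1))])
A_ub = np.vstack([A_margin, A_l1])
b_ub = np.concatenate([np.zeros(n), [1.0]])
bounds = [(0, None)] * (2 * p) + [(None, None)]  # u, v >= 0; t free

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
kappa = -res.fun                                 # maximum l1-margin
theta_hat = res.x[:p] - res.x[p : 2 * p]         # margin-maximizing direction
print(f"max l1-margin kappa: {kappa:.4f}")
print(f"active features: {int((np.abs(theta_hat) > 1e-8).sum())} of {p}")
print(f"min-l1-norm interpolant norm: {np.abs(theta_hat / kappa).sum():.2f}")
```

Because the optimum of this linear program sits at a vertex, the recovered θ̂ is typically supported on at most about n + 1 of the p coordinates, which gives one concrete way to see why only a fraction of features are active at interpolation.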

Funding Statement

T. Liang acknowledges support from the NSF CAREER Award (DMS-2042473), the George C. Tiao Faculty Fellowship and the William S. Fishman Faculty Research Fund at the University of Chicago Booth School of Business.
P. Sur was partially supported by the Center for Research on Computation and Society, Harvard John A. Paulson School of Engineering and Applied Sciences and by NSF DMS-2113426.

Acknowledgments

T. Liang wishes to thank Yoav Freund, Bin Yu and Misha Belkin, as well as participants in the Learning Theory seminar at Google Research and the NSF-Simons Collaboration on Mathematics of Deep Learning, for constructive feedback that greatly improved the paper.

P. Sur wishes to thank the organizers and participants of the Young Data Science Researcher Seminar, ETH Zurich, for constructive feedback.

Both authors gratefully thank the anonymous referees and the associate editor who provided helpful comments that greatly improved the manuscript.

Citation


Tengyuan Liang, Pragya Sur. "A precise high-dimensional asymptotic theory for boosting and minimum-ℓ1-norm interpolated classifiers." Ann. Statist. 50(3): 1669–1695, June 2022. https://doi.org/10.1214/22-AOS2170

Information

Received: 1 July 2021; Revised: 1 December 2021; Published: June 2022
First available in Project Euclid: 16 June 2022

MathSciNet: MR4441136
zbMATH: 1490.68188
Digital Object Identifier: 10.1214/22-AOS2170

Subjects:
Primary: 68Q32
Secondary: 62H30

Keywords: boosting, high-dimensional asymptotics, margin theory, minimum-norm interpolation, overparametrization

Rights: Copyright © 2022 Institute of Mathematical Statistics

Journal article · 27 pages

