Open Access
Boosting the margin: a new explanation for the effectiveness of voting methods
Peter Bartlett, Yoav Freund, Wee Sun Lee, Robert E. Schapire
Ann. Statist. 26(5): 1651-1686 (October 1998). DOI: 10.1214/aos/1024691352

Abstract

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this phenomenon is related to the distribution of margins of the training examples with respect to the generated voting classification rule, where the margin of an example is simply the difference between the number of correct votes and the maximum number of votes received by any incorrect label. We show that techniques used in the analysis of Vapnik’s support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error. We also show theoretically and experimentally that boosting is especially effective at increasing the margins of the training examples. Finally, we compare our explanation to those based on the bias-variance decomposition.
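
For the two-class case with labels y ∈ {-1, +1}, the margin the abstract describes reduces to y·f(x) / Σ_t α_t, where f(x) = Σ_t α_t h_t(x) is the weighted vote: it lies in [-1, 1] and is positive exactly when the vote classifies the example correctly. The sketch below is not the authors' code; the toy dataset, decision-stump base learners, and round count are illustrative assumptions. It trains a minimal AdaBoost ensemble and computes this normalized margin for every training example.

```python
# Minimal sketch: AdaBoost with decision stumps on toy data, then the
# normalized margin distribution of the training examples.
# All data and hyperparameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy binary data: labels in {-1, +1} from a linear rule.
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

def best_stump(X, y, w):
    """Pick the decision stump (feature, threshold, sign) with the
    smallest weighted training error, by exhaustive search."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] > thr, sign, -sign)
                err = np.sum(w * (pred != y))
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, rounds=50):
    n = len(y)
    w = np.full(n, 1.0 / n)        # example weights D_t
    hypotheses = []                # (alpha_t, feature, threshold, sign)
    for _ in range(rounds):
        err, j, thr, sign = best_stump(X, y, w)
        err = max(err, 1e-12)      # guard against a perfect stump
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(X[:, j] > thr, sign, -sign)
        w *= np.exp(-alpha * y * pred)  # upweight the mistakes
        w /= w.sum()
        hypotheses.append((alpha, j, thr, sign))
    return hypotheses

def margins(X, y, hypotheses):
    """Normalized margin y*f(x)/sum(alpha): lies in [-1, 1] and is
    positive iff the weighted vote classifies the example correctly."""
    f = np.zeros(len(y))
    total = 0.0
    for alpha, j, thr, sign in hypotheses:
        f += alpha * np.where(X[:, j] > thr, sign, -sign)
        total += alpha
    return y * f / total

H = adaboost(X, y, rounds=50)
m = margins(X, y, H)
print("training error:", np.mean(m <= 0))
print("fraction with margin below 0.5:", np.mean(m < 0.5))
```

Re-running with more boosting rounds and comparing the distribution of m illustrates the paper's observation: the margins of the training examples tend to keep increasing even after the training error (the fraction with m <= 0) reaches zero.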

Citation

Peter Bartlett, Yoav Freund, Wee Sun Lee, Robert E. Schapire. "Boosting the margin: a new explanation for the effectiveness of voting methods." Ann. Statist. 26(5): 1651-1686, October 1998. https://doi.org/10.1214/aos/1024691352

Information

Published: October 1998
First available in Project Euclid: 21 June 2002

zbMATH: 0929.62069
MathSciNet: MR1673273
Digital Object Identifier: 10.1214/aos/1024691352

Subjects:
Primary: 62H30

Keywords: bagging, boosting, decision trees, ensemble methods, error-correcting output coding, Markov chain Monte Carlo, neural networks

Rights: Copyright © 1998 Institute of Mathematical Statistics

Vol. 26 • No. 5 • October 1998