The Annals of Applied Statistics

New multicategory boosting algorithms based on multicategory Fisher-consistent losses

Hui Zou, Ji Zhu, and Trevor Hastie

Abstract

Fisher-consistent loss functions play a fundamental role in the construction of successful binary margin-based classifiers. In this paper we establish the Fisher-consistency condition for multicategory classification problems. Our approach uses the margin vector concept, which can be regarded as a multicategory generalization of the binary margin. We characterize a wide class of smooth convex loss functions that are Fisher-consistent for multicategory classification. We then consider using the margin-vector-based loss functions to derive multicategory boosting algorithms. In particular, we derive two new multicategory boosting algorithms using the exponential and logistic regression losses.
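
To make the Fisher-consistency idea concrete, the sketch below (illustrative only: the function name, the four-class setup, and the use of NumPy are our assumptions, not code from the paper) minimizes the conditional risk of the multicategory exponential loss over sum-to-zero margin vectors and checks that the minimizer ranks the classes exactly as the conditional class probabilities do.

```python
# Minimal numerical check (a sketch, not the authors' code) that the
# multicategory exponential loss is Fisher-consistent under the
# margin-vector formulation: f = (f_1, ..., f_K) with sum_j f_j = 0 and
# conditional risk sum_j p_j * exp(-f_j). Lagrange multipliers give the
# minimizer f_j* = log p_j - (1/K) * sum_k log p_k, so
# argmax_j f_j* = argmax_j p_j: minimizing the risk recovers the Bayes rule.
import numpy as np

def exp_loss_minimizer(p):
    """Closed-form margin vector minimizing sum_j p[j]*exp(-f[j])
    subject to sum_j f[j] = 0."""
    log_p = np.log(p)
    return log_p - log_p.mean()

rng = np.random.default_rng(0)
for _ in range(5):
    p = rng.dirichlet(np.ones(4))       # random 4-class probability vector
    f = exp_loss_minimizer(p)
    assert abs(f.sum()) < 1e-12         # sum-to-zero constraint satisfied
    assert f.argmax() == p.argmax()     # Bayes class has the largest margin
    print(np.round(p, 3), "->", np.round(f, 3))
```

The logistic regression loss admits no closed-form minimizer under the same constraint, but a generic constrained optimizer passes the identical argmax check, which is precisely the behavior the Fisher-consistency condition guarantees.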

Article information

Source
Ann. Appl. Stat. Volume 2, Number 4 (2008), 1290–1306.

Dates
First available in Project Euclid: 8 January 2009

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1231424211

Digital Object Identifier
doi:10.1214/08-AOAS198

Mathematical Reviews number (MathSciNet)
MR2655660

Zentralblatt MATH identifier
1158.62044

Keywords
Boosting; Fisher-consistent losses; multicategory classification

Citation

Zou, Hui; Zhu, Ji; Hastie, Trevor. New multicategory boosting algorithms based on multicategory Fisher-consistent losses. Ann. Appl. Stat. 2 (2008), no. 4, 1290–1306. doi:10.1214/08-AOAS198. https://projecteuclid.org/euclid.aoas/1231424211.


References

  • Allwein, E., Schapire, R. and Singer, Y. (2000). Reducing multiclass to binary: A unifying approach for margin classifiers. J. Mach. Learn. Res. 1 113–141.
  • Blanchard, G., Lugosi, G. and Vayatis, N. (2003). On the rate of convergence of regularized boosting classifiers. J. Mach. Learn. Res. 4 861–894.
  • Bühlmann, P. and Hothorn, T. (2007). Boosting algorithms: Regularization, prediction and model fitting (with discussion). Statist. Sci. 22 477–505.
  • Bühlmann, P. and Yu, B. (2003). Boosting with the L2 loss: Regression and classification. J. Amer. Statist. Assoc. 98 324–339.
  • Buja, A., Stuetzle, W. and Shen, Y. (2005). Loss functions for binary class probability estimation and classification: Structure and applications. Technical report, Dept. of Statistics, Univ. Pennsylvania.
  • Newman, D. J., Hettich, S., Blake, C. L. and Merz, C. J. (1998). UCI repository of machine learning databases. Dept. of Information and Computer Sciences, Univ. California, Irvine. Available at http://www.ics.uci.edu/~mlearn/mlrepository.html.
  • Freund, Y. and Schapire, R. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. System Sci. 55 119–139.
  • Friedman, J. (2001). Greedy function approximation: A gradient boosting machine. Ann. Statist. 29 1189–1232.
  • Friedman, J., Hastie, T. and Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting (with discussion). Ann. Statist. 28 337–407.
  • Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning. Data Mining, Inference and Prediction. Springer, New York.
  • Koltchinskii, V. and Panchenko, D. (2002). Empirical margin distributions and bounding the generalization error of combined classifiers. Ann. Statist. 30 1–50.
  • Lee, Y., Lin, Y. and Wahba, G. (2004). Multicategory support vector machines: Theory and application to the classification of microarray data and satellite radiance data. J. Amer. Statist. Assoc. 99 67–81.
  • Lin, Y. (2002). Support vector machines and the Bayes rule in classification. Data Mining and Knowledge Discovery 6 259–275.
  • Lin, Y. (2004). A note on margin-based loss functions in classification. Statist. Probab. Lett. 68 73–82.
  • Liu, Y. and Shen, X. (2006). Multicategory ψ-learning. J. Amer. Statist. Assoc. 101 500–509.
  • Liu, Y., Shen, X. and Doss, H. (2005). Multicategory ψ-learning and support vector machine: Computational tools. J. Comput. Graph. Statist. 14 219–236.
  • Lugosi, G. and Vayatis, N. (2004). On the Bayes-risk consistency of regularized boosting methods (with discussion). Ann. Statist. 32 30–55.
  • Rifkin, R. and Klautau, A. (2004). In defense of one-vs-all classification. J. Mach. Learn. Res. 5 101–141.
  • Schapire, R. and Singer, Y. (1999). Improved boosting algorithms using confidence-rated predictions. Machine Learning 37 297–336.
  • Vapnik, V. (1996). The Nature of Statistical Learning Theory. Springer, New York.
  • Zhang, T. (2004a). Statistical analysis of some multi-category large margin classification methods. J. Mach. Learn. Res. 5 1225–1251.
  • Zhang, T. (2004b). Statistical behavior and consistency of classification methods based on convex risk minimization. Ann. Statist. 32 56–85.