December 2018
Multiclass classification, information, divergence and surrogate risk
John Duchi, Khashayar Khosravi, Feng Ruan
Ann. Statist. 46(6B): 3246-3275 (December 2018). DOI: 10.1214/17-AOS1657


We provide a unifying view of statistical information measures, multiway Bayesian hypothesis testing, loss functions for multiclass classification problems and multidistribution $f$-divergences, elaborating equivalence results between all of these objects, and extending existing results for binary outcome spaces to more general ones. We consider a generalization of $f$-divergences to multiple distributions, and we provide a constructive equivalence between divergences, statistical information (in the sense of DeGroot) and losses for multiclass classification. A major application of our results is in multiclass classification problems in which we must both infer a discriminant function $\gamma$—for making predictions on a label $Y$ from datum $X$—and a data representation (or, in the setting of a hypothesis testing problem, an experimental design), represented as a quantizer $\mathsf{q}$ from a family of possible quantizers $\mathsf{Q}$. In this setting, we characterize the equivalence between loss functions, meaning that optimizing either of two losses yields an optimal discriminant and quantizer $\mathsf{q}$, complementing and extending earlier results of Nguyen et al. [Ann. Statist. 37 (2009) 876–904] to the multiclass case. Our results provide a more substantial basis than standard classification calibration results for comparing different losses: we describe the convex losses that are consistent for jointly choosing a data representation and minimizing the (weighted) probability of error in multiclass classification problems.
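As a rough illustration of the link between quantizers and statistical information described above, the following sketch computes the gap between prior and posterior Bayes risk under zero-one loss for a small multiway testing problem, before and after quantization. Everything in it is an assumption made for illustration and is not taken from the paper: the finite sample space, the example prior and distributions, the quantizer `q`, and the helper names `bayes_risk_01`, `quantize` and `statistical_information`. The paper's definitions cover general losses and outcome spaces and may differ in detail.

```python
import numpy as np


def bayes_risk_01(prior, dists):
    """Bayes risk under zero-one loss for a finite sample space.

    prior: shape (k,), prior probabilities of the k hypotheses.
    dists: shape (k, n), row i is the distribution P_i over n outcomes.
    The optimal test picks the hypothesis with the largest joint mass
    prior[i] * dists[i, x] at each outcome x.
    """
    joint = prior[:, None] * dists
    return 1.0 - joint.max(axis=0).sum()


def quantize(dists, q, m):
    """Push each distribution through a quantizer q: {0..n-1} -> {0..m-1}.

    q is a hypothetical quantizer given as a list of cell indices.
    """
    out = np.zeros((dists.shape[0], m))
    for x, cell in enumerate(q):
        out[:, cell] += dists[:, x]
    return out


def statistical_information(prior, dists):
    """Prior Bayes risk minus posterior Bayes risk (zero-one loss)."""
    prior_risk = 1.0 - prior.max()
    return prior_risk - bayes_risk_01(prior, dists)


# Illustrative (made-up) three-class problem on four outcomes.
prior = np.array([0.5, 0.3, 0.2])
dists = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
])
q = [0, 1, 1, 1]  # merges the last three outcomes into one cell

info_full = statistical_information(prior, dists)
info_quantized = statistical_information(prior, quantize(dists, q, 2))
print(info_full, info_quantized)
```

In this toy setting the quantized experiment can never carry more information than the original one, so choosing a quantizer from a family amounts to giving up as little of this gap as possible.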


Received: 1 September 2016; Revised: 1 October 2017; Published: December 2018
First available in Project Euclid: 11 September 2018

zbMATH: 1408.62115
MathSciNet: MR3852651

Primary: 62C05, 62G10, 62K05, 68Q32, 94A17

Keywords: $f$-divergence, hypothesis test, risk, surrogate loss function

Rights: Copyright © 2018 Institute of Mathematical Statistics

Vol. 46 • No. 6B • December 2018
