Open Access
Surrogate losses in passive and active learning
Steve Hanneke, Liu Yang
Electron. J. Statist. 13(2): 4646-4708 (2019). DOI: 10.1214/19-EJS1635


Active learning is a type of sequential design for supervised machine learning, in which the learning algorithm sequentially requests the labels of selected instances from a large pool of unlabeled data points. The objective is to produce a classifier of relatively low risk, as measured under the $0$-$1$ loss, ideally using fewer label requests than the number of random labeled data points sufficient to achieve the same. This work investigates the potential uses of surrogate loss functions in the context of active learning. Specifically, it presents an active learning algorithm based on an arbitrary classification-calibrated surrogate loss function, along with an analysis of the number of label requests sufficient for the classifier returned by the algorithm to achieve a given risk under the $0$-$1$ loss. Interestingly, these results cannot be obtained by simply optimizing the surrogate risk via active learning to an extent sufficient to provide a guarantee on the $0$-$1$ loss, as is common practice in the analysis of surrogate losses for passive learning. Some of the results have additional implications for the use of surrogate losses in passive learning.
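To make the setting concrete, here is a minimal pool-based active learning sketch in which a classifier is trained by minimizing a classification-calibrated surrogate (the logistic loss) while labels are requested sequentially via uncertainty sampling. This is an illustration of the general framework only, not the algorithm analyzed in the paper; the one-dimensional model, the synthetic pool, and the uncertainty-sampling query rule are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D pool of unlabeled points; the (hidden) label is sign(x).
pool_x = rng.uniform(-1, 1, size=500)
pool_y = np.where(pool_x >= 0, 1.0, -1.0)

def surrogate_risk_grad(w, x, y):
    # Gradient of the average logistic loss log(1 + exp(-y * w * x)),
    # a classification-calibrated surrogate for the 0-1 loss.
    margins = y * w * x
    return np.mean(-y * x / (1.0 + np.exp(margins)))

def fit(x, y, steps=200, lr=1.0):
    # Minimize the empirical surrogate risk by gradient descent.
    w = 0.0
    for _ in range(steps):
        w -= lr * surrogate_risk_grad(w, x, y)
    return w

# Active learning loop: start from a few random labeled points, then
# repeatedly request the label of the pool point closest to the current
# decision boundary (smallest |w * x|), i.e. uncertainty sampling.
labeled = list(rng.choice(len(pool_x), size=5, replace=False))
for _ in range(20):
    w = fit(pool_x[labeled], pool_y[labeled])
    unlabeled = [i for i in range(len(pool_x)) if i not in labeled]
    scores = np.abs(w * pool_x[unlabeled])
    labeled.append(unlabeled[int(np.argmin(scores))])

# Evaluate the returned classifier under the 0-1 loss on the whole pool.
w = fit(pool_x[labeled], pool_y[labeled])
zero_one_risk = np.mean(np.sign(w * pool_x) != pool_y)
print(f"label requests: {len(labeled)}, 0-1 risk on pool: {zero_one_risk:.3f}")
```

The point of the abstract's caveat is visible even in this toy setup: the query rule and stopping criterion are driven by the surrogate-trained classifier, and the quantity one ultimately cares about (the 0-1 risk of the returned classifier) is not obtained by simply driving the surrogate risk below some threshold.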




Received: 1 June 2018; Published: 2019
First available in Project Euclid: 13 November 2019

zbMATH: 07136627
MathSciNet: MR4030368
Digital Object Identifier: 10.1214/19-EJS1635

Primary: 62H30, 62L05, 68Q32, 68T05
Secondary: 62G99, 68Q10, 68Q25, 68T10, 68W40

Keywords: Active learning, classification, selective sampling, sequential design, statistical learning theory, surrogate loss functions

