Electronic Journal of Statistics

Near-optimal Bayesian active learning with correlated and noisy tests

Yuxin Chen, S. Hamed Hassani, and Andreas Krause



We consider the Bayesian active learning and experimental design problem, where the goal is to learn the value of some unknown target variable through a sequence of informative, noisy tests. In contrast to prior work, we focus on the challenging, yet practically relevant setting where test outcomes can be conditionally dependent given the hidden target variable. Under such assumptions, common heuristics, such as greedily performing tests that maximize the reduction in uncertainty of the target, often perform poorly.

We propose ECED, a novel, efficient active learning algorithm, and prove strong theoretical guarantees that hold with correlated, noisy tests. Rather than directly optimizing the prediction error, at each step, ECED picks the test that maximizes the gain in a surrogate objective, which takes into account the dependencies between tests. Our analysis relies on an information-theoretic auxiliary function to track the progress of ECED, and utilizes adaptive submodularity to attain the approximation bound. We demonstrate strong empirical performance of ECED on three problem instances, including a Bayesian experimental design task intended to distinguish among economic theories of how people make risky decisions, an active preference learning task via pairwise comparisons, and a third application on pool-based active learning.
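The greedy uncertainty-reduction heuristic discussed above (pick the test with the largest expected reduction in posterior entropy of the target) can be sketched as follows. This is an illustrative baseline only, not ECED's surrogate objective, which the paper defines to account for dependencies between tests; all names and the toy model below are hypothetical.

```python
import math

def posterior(prior, likelihood, test, outcome):
    # Bayes update: P(h | test = outcome) ∝ P(h) * P(outcome | h, test).
    unnorm = {h: p * likelihood[test][h][outcome] for h, p in prior.items()}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

def entropy(dist):
    # Shannon entropy in bits.
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_info_gain(prior, likelihood, test, outcomes):
    # Expectation over outcomes of the entropy reduction H(prior) - H(posterior).
    gain = 0.0
    for o in outcomes:
        p_o = sum(p * likelihood[test][h][o] for h, p in prior.items())
        if p_o > 0:
            gain += p_o * (entropy(prior) - entropy(posterior(prior, likelihood, test, o)))
    return gain

def greedy_test(prior, likelihood, outcomes):
    # Greedily select the test with maximal expected information gain.
    return max(likelihood, key=lambda t: expected_info_gain(prior, likelihood, t, outcomes))

# Example: test "a" is informative (90% accurate), test "b" is nearly a coin flip.
prior = {"h0": 0.5, "h1": 0.5}
likelihood = {
    "a": {"h0": {0: 0.1, 1: 0.9}, "h1": {0: 0.9, 1: 0.1}},
    "b": {"h0": {0: 0.4, 1: 0.6}, "h1": {0: 0.6, 1: 0.4}},
}
print(greedy_test(prior, likelihood, outcomes=[0, 1]))  # -> a
```

With conditionally independent tests this greedy rule enjoys near-optimality guarantees; the abstract's point is that it can perform poorly once test outcomes are conditionally dependent given the target, which motivates ECED's surrogate objective.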

Article information

Electron. J. Statist., Volume 11, Number 2 (2017), 4969-5017.

Received: June 2017
First available in Project Euclid: 15 December 2017

Permanent link to this document: https://projecteuclid.org/euclid.ejs/1513306865

Digital Object Identifier: doi:10.1214/17-EJS1336SI

Keywords: Bayesian active learning; information gathering; decision making; noisy observation; approximation algorithms

This article is available under a Creative Commons Attribution 4.0 International License.


Chen, Yuxin; Hassani, S. Hamed; Krause, Andreas. Near-optimal Bayesian active learning with correlated and noisy tests. Electron. J. Statist. 11 (2017), no. 2, 4969--5017. doi:10.1214/17-EJS1336SI. https://projecteuclid.org/euclid.ejs/1513306865


