We consider the Bayesian active learning and experimental design problem, where the goal is to learn the value of some unknown target variable through a sequence of informative, noisy tests. In contrast to prior work, we focus on the challenging, yet practically relevant setting where test outcomes can be conditionally dependent given the hidden target variable. Under such assumptions, common heuristics, such as greedily performing tests that maximize the reduction in uncertainty of the target, often perform poorly.
We propose ECED, a novel, efficient active learning algorithm, and prove strong theoretical guarantees that hold with correlated, noisy tests. Rather than directly optimizing the prediction error, at each step, ECED picks the test that maximizes the gain in a surrogate objective, which takes into account the dependencies between tests. Our analysis relies on an information-theoretic auxiliary function to track the progress of ECED, and utilizes adaptive submodularity to attain the approximation bound. We demonstrate strong empirical performance of ECED on three problem instances, including a Bayesian experimental design task intended to distinguish among economic theories of how people make risky decisions, an active preference learning task via pairwise comparisons, and a third application on pool-based active learning.
"Near-optimal Bayesian active learning with correlated and noisy tests." Electron. J. Statist. 11 (2) 4969 - 5017, 2017. https://doi.org/10.1214/17-EJS1336SI