Open Access
Bandit problems with infinitely many arms
Donald A. Berry, Robert W. Chen, Alan Zame, David C. Heath, Larry A. Shepp
Ann. Statist. 25(5): 2103-2116 (October 1997). DOI: 10.1214/aos/1069362389


We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with $n \to \infty$. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between $\sqrt{2}/\sqrt{n}$ and $2/\sqrt{n}$ and we exhibit classes of strategies that achieve the latter.
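To make the $2/\sqrt{n}$ rate concrete, the following is a minimal simulation sketch of one simple "stay with a winner, switch on a loser" rule in the uniform case. It is an illustration under stated assumptions, not necessarily one of the strategy classes the paper analyzes: an arm is played until its first failure and then discarded for good, unless it first produces a run of $m \approx \sqrt{n}$ successes, in which case it is kept for all remaining pulls. The function names, the choice of $m$, and the Monte Carlo setup are all hypothetical.

```python
import math
import random


def failure_proportion(n, m, rng):
    """Simulate n pulls of a run-based rule and return failures / n.

    Assumed rule (illustrative): play the current arm until it fails,
    then switch to a fresh arm and never return; once an arm produces
    m successes in a row, commit to it for every remaining pull.
    """
    failures = 0
    run = 0                 # current success run on an uncommitted arm
    committed = False
    p = rng.random()        # arm's success probability, drawn from Uniform(0, 1)
    for _ in range(n):
        if rng.random() < p:        # success
            if not committed:
                run += 1
                if run >= m:
                    committed = True
        else:                       # failure
            failures += 1
            if not committed:
                p = rng.random()    # discard the arm and draw a new one
                run = 0
    return failures / n


def mean_failure_proportion(n, reps, seed=0):
    """Average the failure proportion over reps independent simulations."""
    rng = random.Random(seed)
    m = max(1, round(math.sqrt(n)))  # assumed run length, m close to sqrt(n)
    return sum(failure_proportion(n, m, rng) for _ in range(reps)) / reps
```

Calling `mean_failure_proportion(10_000, 200)` and comparing the result with `2 / math.sqrt(10_000)` gives a rough empirical check of the rate. Agreement is only approximate at finite $n$, and individual runs vary considerably.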



Donald A. Berry, Robert W. Chen, Alan Zame, David C. Heath, Larry A. Shepp. "Bandit problems with infinitely many arms." Ann. Statist. 25(5): 2103-2116, October 1997.


Published: October 1997
First available in Project Euclid: 20 November 2003

zbMATH: 0881.62083
MathSciNet: MR1474085
Digital Object Identifier: 10.1214/aos/1069362389

Primary: 60F99, 62C25, 62L05

Keywords: bandit problems, dynamic allocation of Bernoulli processes, sequential experimentation, staying with a winner, switching with a loser

Rights: Copyright © 1997 Institute of Mathematical Statistics

