Open Access
April, 1981
Arm-Acquiring Bandits
P. Whittle
Ann. Probab. 9(2): 284-292 (April, 1981). DOI: 10.1214/aop/1176994469


We consider the problem of allocating effort between projects at different stages of development when new projects are also continually appearing. An expression (14) is derived for the expected reward yielded by the Gittins index policy. This is shown to satisfy the dynamic programming equation for the problem, so confirming optimality of the policy.


P. Whittle. "Arm-Acquiring Bandits." Ann. Probab. 9 (2) 284-292, April, 1981.


Published: April, 1981
First available in Project Euclid: 19 April 2007

zbMATH: 0464.90081
MathSciNet: MR606990
Digital Object Identifier: 10.1214/aop/1176994469

Primary: 42C99
Secondary: 62C99

Keywords: allocation index, dynamic programming, multiarmed bandit

Rights: Copyright © 1981 Institute of Mathematical Statistics
