The Robbins-Isbell Two-Armed-Bandit Problem with Finite Memory
Carter Vincent Smith, Ronald Pyke
Ann. Math. Statist. 36(5): 1375-1386 (October, 1965). DOI: 10.1214/aoms/1177699897


This paper studies the sequential decision model known as the two-armed bandit with finite memory. The model was introduced by Robbins [8] in 1956 and studied further by Isbell [5] in 1959. In this paper, a set of rules is defined that are uniformly better than those given in [5] and [8]. A much larger class of rules is then defined, one member of which is conjectured to be a uniformly best rule.



Published: October, 1965
First available in Project Euclid: 27 April 2007

zbMATH: 0133.41701
MathSciNet: MR182107
Digital Object Identifier: 10.1214/aoms/1177699897

Rights: Copyright © 1965 Institute of Mathematical Statistics
