Open Access
March, 1985
Further Contributions to the "Two-Armed Bandit" Problem
Robert Keener
Ann. Statist. 13(1): 418-422 (March, 1985). DOI: 10.1214/aos/1176346603

Abstract

A version of the two-armed bandit with two states of nature and two repeatable experiments is studied. With an infinite horizon and with or without discounting, an optimal procedure is to perform one experiment whenever the posterior probability of one of the states of nature exceeds a constant $\xi^\ast$, perform the other experiment whenever the posterior is less than $\xi^\ast$, and be indifferent when the posterior equals $\xi^\ast$. The threshold $\xi^\ast$ is expressed in terms involving expectations of ladder variables and can be calculated using Spitzer series.
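To make the structure of the policy concrete, the following is a minimal sketch of the threshold rule described in the abstract, on an illustrative Bernoulli model. The outcome model, the success probabilities in `succ_prob`, and the value `xi_star = 0.4` are assumptions made for illustration only; the paper characterizes $\xi^\ast$ via expectations of ladder variables and Spitzer series, which this sketch does not compute.

```python
import random

def simulate_threshold_rule(xi_star, p0, horizon, succ_prob, true_state, seed=0):
    """Simulate the threshold rule on a simple Bernoulli model (illustrative only).

    succ_prob[experiment][state] is the success probability of `experiment`
    when the true state of nature is `state` (both in {0, 1}).  The rule
    performs experiment 0 whenever the posterior probability of state 0
    exceeds xi_star, experiment 1 whenever it falls below, and here breaks
    the tie (indifference case) in favour of experiment 0.
    """
    rng = random.Random(seed)
    p = p0                      # posterior probability of state 0
    history = []
    for _ in range(horizon):
        arm = 0 if p >= xi_star else 1
        # Draw an outcome from the chosen experiment under the true state.
        outcome = 1 if rng.random() < succ_prob[arm][true_state] else 0
        # Bayes update of the posterior probability of state 0.
        like0 = succ_prob[arm][0] if outcome else 1 - succ_prob[arm][0]
        like1 = succ_prob[arm][1] if outcome else 1 - succ_prob[arm][1]
        p = p * like0 / (p * like0 + (1 - p) * like1)
        history.append((arm, outcome, p))
    return history

# Hypothetical example: experiment 0 pays off under state 0, experiment 1
# under state 1; xi_star = 0.4 is a made-up threshold, not the paper's value.
trace = simulate_threshold_rule(
    xi_star=0.4, p0=0.5, horizon=20,
    succ_prob=[[0.7, 0.3], [0.3, 0.7]], true_state=0)
print(trace[-1])
```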

Citation


Robert Keener. "Further Contributions to the "Two-Armed Bandit" Problem." Ann. Statist. 13 (1) 418 - 422, March, 1985. https://doi.org/10.1214/aos/1176346603

Information

Published: March, 1985
First available in Project Euclid: 12 April 2007

zbMATH: 0567.62067
MathSciNet: MR773178
Digital Object Identifier: 10.1214/aos/1176346603

Subjects:
Primary: 62L05
Secondary: 62L10

Keywords: dynamic programming, random walks, sequential design

Rights: Copyright © 1985 Institute of Mathematical Statistics
