The Annals of Statistics

Further Contributions to the "Two-Armed Bandit" Problem

Robert Keener

Abstract

A version of the two-armed bandit with two states of nature and two repeatable experiments is studied. With an infinite horizon, and with or without discounting, an optimal procedure is to perform one experiment whenever the posterior probability of one of the states of nature exceeds a constant $\xi^\ast$, and to perform the other experiment whenever the posterior is less than $\xi^\ast$, with indifference when the posterior equals $\xi^\ast$. The constant $\xi^\ast$ is expressed in terms involving expectations of ladder variables and can be calculated using Spitzer series.
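
The threshold form of the procedure described above can be illustrated with a minimal sketch. This is not the paper's construction: the function names, the experiment labels "A" and "B", and the choice of which experiment goes with which side of the threshold are hypothetical. The value of the threshold is supplied by the caller here, not derived from ladder variables or Spitzer series.

```python
def update_posterior(prior, lik_state1, lik_state2):
    """Bayes update of P(state 1) given the likelihood of the
    observed outcome under each of the two states of nature."""
    num = prior * lik_state1
    den = num + (1.0 - prior) * lik_state2
    return num / den


def choose_experiment(posterior, xi_star):
    """Threshold rule: one experiment above xi_star, the other below.

    The labels "A" and "B" are hypothetical. At posterior == xi_star
    either choice is optimal; this sketch arbitrarily returns "A".
    """
    return "A" if posterior >= xi_star else "B"


if __name__ == "__main__":
    # Assumed numbers for illustration only.
    p = update_posterior(0.5, lik_state1=0.8, lik_state2=0.2)
    print(p, choose_experiment(p, xi_star=0.5))
```

In a full procedure the two steps would alternate: choose an experiment from the current posterior, observe its outcome, update the posterior, and repeat.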

Article information

Source
Ann. Statist., Volume 13, Number 1 (1985), 418-422.

Dates
First available in Project Euclid: 12 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1176346603

Digital Object Identifier
doi:10.1214/aos/1176346603

Mathematical Reviews number (MathSciNet)
MR773178

Zentralblatt MATH identifier
0567.62067

JSTOR
links.jstor.org

Subjects
Primary: 62L05: Sequential design
Secondary: 62L10: Sequential analysis

Keywords
Dynamic programming; sequential design; random walks

Citation

Keener, Robert. Further Contributions to the "Two-Armed Bandit" Problem. Ann. Statist. 13 (1985), no. 1, 418--422. doi:10.1214/aos/1176346603. https://projecteuclid.org/euclid.aos/1176346603

