The Annals of Statistics

Two-Stage Bandits

Murray K. Clayton and Jeffrey A. Witmer


Abstract

Two stochastic processes, or "arms," that yield dichotomous responses are available for use in a two-stage decision problem. During the first stage, arms are chosen sequentially; the resulting observations are discounted by a fixed value $\beta$. A single arm must be used in the second stage, in which observations are not discounted. The decision to end the first stage is based on the data obtained. Optimal strategies are considered in the presence of the random discount sequence that arises in this setting. This extends the work of Berry and Fristedt (1979).
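As a rough illustration of the payoff structure described above, the following Python sketch scores one realized run of a two-stage problem. It is not taken from the paper. It assumes geometric weights (the $i$-th first-stage observation is weighted by $\beta^i$, starting at $i = 0$) and weight 1 for each second-stage observation. The function name `two_stage_payoff` and its arguments are hypothetical.

```python
def two_stage_payoff(first_stage, second_stage, beta):
    """Total payoff of one realized two-stage run of 0/1 responses.

    Assumption (not stated in the abstract): the i-th first-stage
    observation (i = 0, 1, ...) is weighted by beta**i, and every
    second-stage observation has weight 1.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    # First stage: responses from sequentially chosen arms, discounted.
    discounted = sum(x * beta**i for i, x in enumerate(first_stage))
    # Second stage: responses from the single committed arm, undiscounted.
    undiscounted = sum(second_stage)
    return discounted + undiscounted
```

For example, with first-stage responses 1, 0, 1, $\beta = 0.5$, and two successes in the second stage, the function returns 1 + 0 + 0.25 + 2 = 3.25. Because the first stage ends at a data-dependent time, the number of discounted terms is itself random, which is the source of the random discount sequence mentioned in the abstract.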

Article information

Source
Ann. Statist., Volume 16, Number 2 (1988), 887-894.

Dates
First available in Project Euclid: 12 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1176350841

Digital Object Identifier
doi:10.1214/aos/1176350841

Mathematical Reviews number (MathSciNet)
MR947583

Zentralblatt MATH identifier
0664.62081


Subjects
Primary: 62C10: Bayesian problems; characterization of Bayes procedures

Keywords
Two-stage bandit; sequential decisions; regular discounting; random discounting

Citation

Clayton, Murray K.; Witmer, Jeffrey A. Two-Stage Bandits. Ann. Statist. 16 (1988), no. 2, 887--894. doi:10.1214/aos/1176350841. https://projecteuclid.org/euclid.aos/1176350841

