Open Access
June, 1988
Two-Stage Bandits
Murray K. Clayton, Jeffrey A. Witmer
Ann. Statist. 16(2): 887-894 (June, 1988). DOI: 10.1214/aos/1176350841

Abstract

Two stochastic processes, or "arms," that yield dichotomous responses are available for use in a two-stage decision problem. During the first stage, arms are chosen sequentially; the resulting observations are discounted by a fixed value $\beta$. A single arm must be used in the second stage, in which observations are not discounted. The decision to end the first stage is based on the data obtained. Optimal strategies are considered in the presence of the random discount sequence that arises in this setting. This extends the work of Berry and Fristedt (1979).

Citation

Murray K. Clayton, Jeffrey A. Witmer. "Two-Stage Bandits." Ann. Statist. 16 (2) 887-894, June, 1988. https://doi.org/10.1214/aos/1176350841

Information

Published: June, 1988
First available in Project Euclid: 12 April 2007

zbMATH: 0664.62081
MathSciNet: MR947583
Digital Object Identifier: 10.1214/aos/1176350841

Subjects:
Primary: 62C10

Keywords: random discounting, regular discounting, sequential decisions, Two-stage bandit

Rights: Copyright © 1988 Institute of Mathematical Statistics

Vol. 16 • No. 2 • June, 1988