Open Access
January, 1975
A Conjecture of Berry Regarding A Bernoulli Two-Armed Bandit
V. M. Joshi
Ann. Statist. 3(1): 189-202 (January, 1975). DOI: 10.1214/aos/1176343007

Abstract

Two independent Bernoulli processes (arms) have unknown success probabilities $\rho$ and $\lambda$. The initial (a priori) information about $\rho$ and $\lambda$ is expressed by probability distributions $dR(\rho) = C_R \rho^{r_0}(1 - \rho)^{r_0'} d\mu(\rho)$ for the right arm, and $dL(\lambda) = C_L \lambda^{l_0}(1 - \lambda)^{l_0'} d\mu(\lambda)$ for the left arm, where $\mu$ is any arbitrary measure on the unit interval. A specified number $n$ of observations is made sequentially, the arm selected at each stage depending on the previous observations and the initial information. A conjecture of Berry states that if the initial information present about the right arm (given by $r_0 + r_0'$) is not greater than that present for the left arm (given by $l_0 + l_0'$) and the initial expected value of $\rho$ is not less than that of $\lambda$, then for any $n$ the advantage (in terms of expected number of successes) of taking the first observation on the right arm is never less than that for the left arm. A proof of this conjecture is given in this paper.
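
The statement can be checked numerically for small $n$. The sketch below is not the paper's proof; it is a minimal dynamic program under the added assumption that $\mu$ is Lebesgue measure, so that the priors are Beta distributions with means $(r_0 + 1)/(r_0 + r_0' + 2)$ and $(l_0 + 1)/(l_0 + l_0' + 2)$. The function names (`posterior_mean`, `value`, `pull_right`, `pull_left`, `advantage_right`) are illustrative and do not come from the paper.

```python
from fractions import Fraction
from functools import lru_cache


def posterior_mean(s, f):
    # Assumption: mu is Lebesgue measure, so the density is proportional to
    # x^s (1 - x)^f, i.e. Beta(s + 1, f + 1), with mean (s + 1) / (s + f + 2).
    return Fraction(s + 1, s + f + 2)


@lru_cache(maxsize=None)
def value(n, r, rp, l, lp):
    # Maximal expected number of successes in n remaining observations,
    # given exponents (r, rp) for the right arm and (l, lp) for the left arm.
    if n == 0:
        return Fraction(0)
    return max(pull_right(n, r, rp, l, lp), pull_left(n, r, rp, l, lp))


def pull_right(n, r, rp, l, lp):
    # Expected successes if the next observation is on the right arm
    # and play is optimal afterwards.
    p = posterior_mean(r, rp)
    return (p * (1 + value(n - 1, r + 1, rp, l, lp))
            + (1 - p) * value(n - 1, r, rp + 1, l, lp))


def pull_left(n, r, rp, l, lp):
    # Expected successes if the next observation is on the left arm
    # and play is optimal afterwards.
    q = posterior_mean(l, lp)
    return (q * (1 + value(n - 1, r, rp, l + 1, lp))
            + (1 - q) * value(n - 1, r, rp, l, lp + 1))


def advantage_right(n, r, rp, l, lp):
    # Advantage of taking the first observation on the right arm.
    return pull_right(n, r, rp, l, lp) - pull_left(n, r, rp, l, lp)


if __name__ == "__main__":
    # Right arm: r0 = r0' = 0 (information 0, mean 1/2).
    # Left arm:  l0 = l0' = 1 (information 2, mean 1/2).
    for n in range(1, 7):
        print(n, advantage_right(n, 0, 0, 1, 1))
```

With $r_0 = r_0' = 0$ and $l_0 = l_0' = 1$, both arms have prior mean $1/2$ and the right arm carries less initial information, so the hypotheses of the conjecture hold. The advantage of the right arm is $0$ for $n = 1$ and $1/30$ for $n = 2$. Exact rational arithmetic is used so that the sign of the advantage is not affected by rounding.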

Citation

V. M. Joshi. "A Conjecture of Berry Regarding A Bernoulli Two-Armed Bandit." Ann. Statist. 3 (1) 189 - 202, January, 1975. https://doi.org/10.1214/aos/1176343007

Information

Published: January, 1975
First available in Project Euclid: 12 April 2007

zbMATH: 0318.62006
MathSciNet: MR359212
Digital Object Identifier: 10.1214/aos/1176343007

Keywords: Bernoulli parameters, Bernoulli two-armed bandit, expected advantage, prior distributions

Rights: Copyright © 1975 Institute of Mathematical Statistics

Vol.3 • No. 1 • January, 1975