## The Annals of Statistics

### A Conjecture of Berry Regarding A Bernoulli Two-Armed Bandit

V. M. Joshi

#### Abstract

Two independent Bernoulli processes (arms) have unknown success probabilities $\rho$ and $\lambda$. The initial (a priori) information about $\rho$ and $\lambda$ is expressed by probability distributions $dR(\rho) = C_R \rho^{r_0}(1 - \rho)^{r_0'} \, d\mu(\rho)$ for the right arm, and $dL(\lambda) = C_L \lambda^{l_0}(1 - \lambda)^{l_0'} \, d\mu(\lambda)$ for the left arm, where $\mu$ is any arbitrary measure on the unit interval. A specified number $n$ of observations is made sequentially, the arm selected at each stage depending on the previous observations and the initial information. A conjecture of Berry states that if the initial information present about the right arm (given by $r_0 + r_0'$) is not greater than that present for the left arm $(l_0 + l_0')$ and the initial expected value of $\rho$ is not less than that of $\lambda$, then for any $n$ the advantage (in terms of expected number of successes) of taking the first observation on the right arm is never less than that for the left arm. A proof of this conjecture is given in this paper.
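The comparison in the conjecture can be illustrated numerically. The following is a minimal sketch, not the paper's proof, and it assumes the special case in which $\mu$ is Lebesgue measure, so that the priors are Beta distributions and the posterior mean after exponents $(s, f)$ is $(s+1)/(s+f+2)$. It computes by backward induction the expected number of successes when the first of $n$ observations is taken on the right or on the left arm and play is optimal afterwards. The function names (`mean`, `value`, `pull_right`, `pull_left`, `advantage_right`) are hypothetical and do not come from the paper.

```python
from fractions import Fraction
from functools import lru_cache


def mean(s, f):
    # Assumption: mu is Lebesgue measure, so a density proportional to
    # p^s (1 - p)^f is Beta(s + 1, f + 1) with mean (s + 1) / (s + f + 2).
    return Fraction(s + 1, s + f + 2)


@lru_cache(maxsize=None)
def value(r, rp, l, lp, n):
    # Maximal expected number of successes in n remaining observations,
    # given current exponents (r, rp) for the right arm and (l, lp) for the left.
    if n == 0:
        return Fraction(0)
    return max(pull_right(r, rp, l, lp, n), pull_left(r, rp, l, lp, n))


def pull_right(r, rp, l, lp, n):
    # Observe the right arm once, then continue optimally.
    p = mean(r, rp)
    return (p * (1 + value(r + 1, rp, l, lp, n - 1))
            + (1 - p) * value(r, rp + 1, l, lp, n - 1))


def pull_left(r, rp, l, lp, n):
    # Observe the left arm once, then continue optimally.
    q = mean(l, lp)
    return (q * (1 + value(r, rp, l + 1, lp, n - 1))
            + (1 - q) * value(r, rp, l, lp + 1, n - 1))


def advantage_right(r0, r0p, l0, l0p, n):
    # Expected successes when starting on the right arm minus
    # expected successes when starting on the left arm.
    return pull_right(r0, r0p, l0, l0p, n) - pull_left(r0, r0p, l0, l0p, n)
```

For example, with $r_0 = r_0' = 0$ and $l_0 = l_0' = 1$ both arms have initial expected value $1/2$ while the right arm carries less initial information, and `advantage_right(0, 0, 1, 1, 2)` evaluates to `Fraction(1, 30)`, which is non-negative as the conjecture requires. Exact rational arithmetic is used so that the sign of the difference is not affected by rounding.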

#### Article information

Source
Ann. Statist., Volume 3, Number 1 (1975), 189-202.

Dates
First available in Project Euclid: 12 April 2007

https://projecteuclid.org/euclid.aos/1176343007

Digital Object Identifier
doi:10.1214/aos/1176343007

Mathematical Reviews number (MathSciNet)
MR359212

Zentralblatt MATH identifier
0318.62006
