Abstract
A problem that seems to be of some practical importance is how to select the best one of $k$ experimental categories or populations when there is a fixed probability for each population that any measurement will be classified as a 'success' and the best population is defined as the one with the greatest probability of a success. For example, we might be interested in determining which of $k$ new drugs offers the greatest probability of survival against a specified disease, or which of $k$ new production techniques has the greatest probability of producing a 'good' item. A treatment of this problem using a fixed sample size approach was given by Sobel and Huyett [6]. To describe their formulation of the problem, denote the populations by $\Pi_1, \Pi_2, \ldots, \Pi_k$, the corresponding probabilities by $p_1, p_2, \ldots, p_k$, and the ordered probabilities by $p_{[1]} \geq p_{[2]} \geq \cdots \geq p_{[k]}$, and let $\Pi_{[j]}$ be the population associated with $p_{[j]}$. Then [6] described a statistical procedure and gave tables for determining the common sample size required with each population so that population $\Pi_{[1]}$ will be selected with probability $\geq P^\ast$ whenever $p_{[1]} \geq p_{[2]} + d$, where $d$ and $P^\ast$ are constants selected in advance of the experiment. This formulation of the problem, which we will call the main formulation, seems satisfactory when nothing is known about the magnitude of $(p_1, p_2, \ldots, p_k)$ or if there is some a priori information available which indicates that $p_{[1]}$ and $p_{[2]}$ do not differ too much from $.5$, say $.25 \leq p_{[2]} \leq p_{[1]} \leq .75$.
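The requirement in the main formulation can be checked numerically. The sketch below estimates, by simulation, the probability that a fixed-sample-size rule selects $\Pi_{[1]}$ for a given common sample size. It assumes the rule "pick the population with the most successes, breaking ties at random"; the abstract does not spell out the rule of [6], so this is an illustrative assumption, and the names `select_best` and `estimate_pcs` are hypothetical.

```python
import random

def select_best(successes, rng):
    """Return the index with the most successes; ties are broken at random.

    Assumed selection rule for illustration only; not taken from the abstract.
    """
    top = max(successes)
    candidates = [i for i, s in enumerate(successes) if s == top]
    return rng.choice(candidates)

def estimate_pcs(p, n, trials=20000, seed=0):
    """Monte Carlo estimate of the probability of selecting the best population.

    p      : list of success probabilities p_1, ..., p_k
    n      : common number of measurements taken from each population
    trials : number of simulated experiments
    """
    rng = random.Random(seed)
    best = max(range(len(p)), key=lambda i: p[i])  # index of p_[1]
    correct = 0
    for _ in range(trials):
        # Number of successes in n Bernoulli trials for each population.
        successes = [sum(rng.random() < pi for _ in range(n)) for pi in p]
        if select_best(successes, rng) == best:
            correct += 1
    return correct / trials

if __name__ == "__main__":
    # One configuration with p_[1] = p_[2] + d, here d = 0.2 and k = 3.
    for n in (5, 20, 50):
        print(n, estimate_pcs([0.6, 0.4, 0.4], n, trials=5000))
```

Increasing `n` until the estimate exceeds a chosen $P^\ast$ mimics, in a rough way, what a sample size table provides; a single configuration is tried here, whereas the guarantee in the text must hold for every configuration with $p_{[1]} \geq p_{[2]} + d$.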
An alternative formulation of the problem when the a priori information indicates that $p_{[2]}$ and $p_{[1]}$ differ substantially from $.5$ was given in [6] as follows: the sample size is determined so that population $\Pi_{[1]}$ is selected with probability $\geq P^\ast$ whenever $p_{[2]} \leq p^\ast_{[2]}$ and $p_{[1]} \geq p^\ast_{[2]} + d$, where $p^\ast_{[2]}$ is an additional constant determined in advance of the experiment on the basis of the a priori information about the probable value of $p_{[2]}$. The present paper is based on a somewhat novel use of the Poisson distribution to obtain a random number of measurements from each population at every stage of the experiment, combined with the application of one-sided sequential confidence limits developed in [4]. Using these techniques we derive sequential procedures for selecting the best population both for the main formulation and for a generalization of the alternative formulation of the problem. Some Monte Carlo calculations which are summarized in Section 5 indicate that a substantial saving is possible with the sequential procedures.
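The abstract does not say why a Poisson number of measurements is used, and the sketch below does not reproduce the paper's procedure. It only illustrates a standard property that makes such a device convenient: if the number of measurements taken from a population at a stage is Poisson with mean $\lambda$ and each measurement is a success with probability $p$, then the success and failure counts are independent Poisson variables with means $\lambda p$ and $\lambda(1 - p)$. The names `poisson` and `stage` and the value of `lam` are hypothetical.

```python
import math
import random

def poisson(lam, rng):
    """Draw a Poisson(lam) variate by Knuth's multiplication method.

    Adequate for small lam; chosen only to keep the sketch dependency-free.
    """
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def stage(p, lam, rng):
    """Simulate one stage: a Poisson(lam) number of measurements per population.

    Returns a list of (successes, failures) pairs, one pair per population.
    """
    result = []
    for pi in p:
        n = poisson(lam, rng)  # random number of measurements at this stage
        s = sum(rng.random() < pi for _ in range(n))
        result.append((s, n - s))
    return result

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [stage([0.3], 5.0, rng)[0] for _ in range(20000)]
    mean_s = sum(s for s, _ in draws) / len(draws)
    mean_f = sum(f for _, f in draws) / len(draws)
    # Sample means should be close to lam * p and lam * (1 - p).
    print(mean_s, mean_f)
```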
Citation
Edward Paulson. "Sequential Procedures for Selecting the Best One of Several Binomial Populations." Ann. Math. Statist. 38 (1) 117 - 123, February, 1967. https://doi.org/10.1214/aoms/1177699062