Annals of Statistics

On Selecting a Subset Containing the Best Population-A Bayesian Approach

Prem K. Goel and Herman Rubin


The problem of selecting a subset of $k$ populations $\pi_1, \cdots, \pi_k$ that contains the "best" population is considered. The unknown values $\theta_1, \cdots, \theta_k$ are the characteristics associated with $\pi_1, \cdots, \pi_k,$ and the unknown population associated with $\theta_{\lbrack k\rbrack} = \max_i\theta_i$ is called the "best." It is assumed that, given $\mathbf{\theta} = (\theta_1, \cdots, \theta_k),$ the pdfs of the independent random variables $X_1, \cdots, X_k$ belong to a monotone likelihood ratio family, the prior distribution of $\mathbf{\theta}$ is exchangeable, and the loss function is a linear combination of two components, namely the subset size $|s|$ and the distance between the overall "best" and the "best" within the selected subset $s,$ i.e., $\mathbf{L}(\mathbf{\theta}, s) = c|s| + \lbrack\theta_{\lbrack k\rbrack} - \max_{j : \pi_j \in s} \theta_j\rbrack.$ It is shown that the Bayes rule depends on at most $(k - 1)$ computable expressions. Some lower and upper bounds on the differences of Bayes risks are given to help reduce the amount of computation for the Bayes rule. If $X_i$ has a normal distribution with mean $\theta_i$ and known variance $\sigma^2,$ then it is shown that (i) for $k = 2,$ the Bayes rule with vague prior knowledge and the classical rule are the same if the probability of correct selection, $P^\ast,$ is chosen as a suitable function of $c,$ and (ii) if $c/\sigma \geqq 1/\pi^{1/2},$ then the Bayes rule selects only one population, and if $0.2821 \leqq c/\sigma < 1/\pi^{1/2},$ then it selects at most two populations. Tables for implementing the Bayes rule for normal populations are also given.
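The loss function and the $k = 2$ normal case can be illustrated with a minimal Python sketch. It is not the authors' procedure or their tables. It assumes that "vague prior knowledge" means a flat prior, so that after one observation per population the posterior of each $\theta_i$ is independent $N(x_i, \sigma^2)$. The function names (`loss`, `expected_regret_k2`, `bayes_subset_k2`) are hypothetical and introduced only for illustration.

```python
import math


def loss(theta, subset, c):
    """L(theta, s) = c*|s| + [max_i theta_i - max_{j in s} theta_j]."""
    if not subset:
        raise ValueError("subset must be non-empty")
    return c * len(subset) + (max(theta) - max(theta[j] for j in subset))


def _phi(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    # Standard normal cumulative distribution function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_regret_k2(d, sigma):
    """Posterior mean of (theta_other - theta_top)^+ for k = 2.

    ASSUMPTION: flat prior, so theta_other - theta_top ~ N(-d, 2*sigma^2),
    where d >= 0 is the gap between the larger and smaller observation.
    Uses E[Z^+] = mu*Phi(mu/tau) + tau*phi(mu/tau) for Z ~ N(mu, tau^2).
    """
    tau = math.sqrt(2.0) * sigma
    z = -d / tau
    return -d * _Phi(z) + tau * _phi(z)


def bayes_subset_k2(x1, x2, sigma, c):
    """Return the indices (0-based) minimizing posterior expected loss for k = 2.

    Compares keeping only the population with the larger observation
    against keeping both populations (which has zero regret but size 2).
    """
    top, other = (0, 1) if x1 >= x2 else (1, 0)
    risk_one = c + expected_regret_k2(abs(x1 - x2), sigma)
    risk_both = 2.0 * c
    return [top] if risk_one <= risk_both else sorted([top, other])
```

Under these assumptions the expected regret at a zero gap equals $\sigma/\pi^{1/2},$ so a single population is always kept once $c/\sigma \geqq 1/\pi^{1/2},$ which agrees with the threshold quoted in the abstract for this two-population case.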

Article information

Ann. Statist., Volume 5, Number 5 (1977), 969-983.

First available in Project Euclid: 12 April 2007


Primary: 62F07: Ranking and selection
Secondary: 62G30: Order statistics; empirical distribution functions

Keywords: Subset selection; nonlinear loss function; normal populations; monotone likelihood ratio family; exchangeable prior distribution; Bayes rules


Goel, Prem K.; Rubin, Herman. On Selecting a Subset Containing the Best Population-A Bayesian Approach. Ann. Statist. 5 (1977), no. 5, 969--983. doi:10.1214/aos/1176343952.
