Sequential Selection of Experiments
K. B. Gray Jr.
Ann. Math. Statist. 39(6): 1953-1977 (December, 1968). DOI: 10.1214/aoms/1177698025

Abstract

The problem of sequential selection of experiments, with fixed and optional stopping, is considered. Conditions are given which allow selection, stopping, and terminal action rules to be based on a sequence $\{T_j\}$ of statistics, where $T_j$ is a function of the past observations $\mathbf{X}^j = (X_1, \cdots, X_j)$ and experiment selections $\mathbf{E}^j = (E_1, \cdots, E_j)$. Randomized stopping, selection, and terminal action rules are allowed, and all probability distributions are defined by densities relative to $\sigma$-finite measures over Euclidean spaces. Here we give a heuristic description of the principal results for the case of optional stopping. At each time $j$ the random variable $X_j$ is observed and a decision is made to stop or continue. If the procedure is stopped, a terminal action $A$ is taken. If it is continued, an experiment $E_{j+1}$, to be performed at time $j + 1$, is chosen. At time $j$, all decisions are based on $\mathbf{X}^j, \mathbf{E}^j$, the past observations and experiment selections. Upon stopping and taking action $A$, a loss $L(\theta, A)$ is incurred, where $\theta$ is the unknown state of nature. The sampling cost of stopping at $j$ is $C_j(\theta, \mathbf{X}^j, \mathbf{E}^j)$. Let the random variable $N$ denote the stopping time. A selection rule $\mathbf{\gamma} = (\gamma_0, \gamma_1, \cdots)$ is defined by the sequence of conditional densities $\gamma_j(e_{j+1}\mid\mathbf{x}^j, \mathbf{e}^j)$, a stopping rule $\mathbf{\phi} = (\phi_0, \phi_1, \cdots)$ by the probabilities $\phi_j(\mathbf{x}^j,\mathbf{e}^j) = P\{N = j\mid N \geqq j, \mathbf{x}^j,\mathbf{e}^j\}$, and a terminal action rule $\mathbf{\delta} = (\delta_0, \delta_1, \cdots)$ by the conditional densities $\delta_j(a\mid\mathbf{x}^j,\mathbf{e}^j)$. Definition of the population densities $f_\theta(x_{j+1}\mid\mathbf{x}^j, \mathbf{e}^{j+1})$ for $j = 0, 1, 2, \cdots$ completely fixes the probability structure.
Define $\{T_j\}$ to be parameter sufficient (PARS) if, for $j = 0, 1, 2, \cdots$, $\operatorname{Dist}_{\theta,\mathbf{\gamma}}(\mathbf{X}^j, \mathbf{E}^j\mid T_j)$ is independent of $\theta$ for all $\mathbf{\gamma}$, and policy sufficient (POLS) if, for $j = 0, 1, 2, \cdots$, $\operatorname{Dist}_{\theta,\mathbf{\phi},\mathbf{\gamma}} (T_{j+1}\mid T_j, E_{j+1}, N \geqq j + 1)$ is independent of $\mathbf{\phi}, \mathbf{\gamma}$ for all $\theta$. THEOREM. If $\{T_j\}$ is PARS, then the class of policies $\{\mathbf{\phi}, \mathbf{\gamma}, \mathbf{\delta}^0\}$, where $\mathbf{\delta}^0$ is based on $\{T_j\}$, is essentially complete. THEOREM. If $\{T_j\}$ is PARS and POLS, and the sampling cost is of the form $C_j(\theta, T_j)$, then the class of policies $\{\mathbf{\phi}^0, \mathbf{\gamma}^0, \mathbf{\delta}^0\}$, where $\mathbf{\phi}^0, \mathbf{\gamma}^0, \mathbf{\delta}^0$ are based on $\{T_j\}$, is essentially complete. Conditions are given to aid in the verification of PARS and POLS. The theorems are applied to examples, including versions of the two-armed bandit problem.
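
As an illustration only, the following Python sketch shows what it means for a policy to be based on $\{T_j\}$ in a two-armed Bernoulli bandit. It is not taken from the paper. It assumes that $T_j$ is the vector of per-arm success and failure counts, and the Beta(1,1) prior, the greedy selection rule, and the stopping thresholds `max_obs` and `gap` are hypothetical choices made for the example. The rules here are nonrandomized, although the abstract allows randomized ones.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Stat:
    """Assumed statistic T_j: success and failure counts for arms 0 and 1."""
    s: tuple = (0, 0)
    f: tuple = (0, 0)

    def update(self, arm, x):
        """Return T_{j+1} given T_j, the chosen arm E_{j+1}, and outcome X_{j+1}."""
        s, f = list(self.s), list(self.f)
        if x == 1:
            s[arm] += 1
        else:
            f[arm] += 1
        return Stat(tuple(s), tuple(f))


def summarize(history):
    """Fold a full history [(e_1, x_1), ..., (e_j, x_j)] into T_j."""
    t = Stat()
    for arm, x in history:
        t = t.update(arm, x)
    return t


def posterior_mean(t, arm):
    """Posterior mean of an arm's success probability (Beta(1,1) prior assumed)."""
    return (t.s[arm] + 1) / (t.s[arm] + t.f[arm] + 2)


def select(t):
    """Selection rule based on T_j only: greedy, ties go to arm 0."""
    return 0 if posterior_mean(t, 0) >= posterior_mean(t, 1) else 1


def stop(t, max_obs=20, gap=0.4):
    """Stopping rule based on T_j only (thresholds are illustrative)."""
    n = sum(t.s) + sum(t.f)
    diff = abs(posterior_mean(t, 0) - posterior_mean(t, 1))
    return n >= max_obs or (n >= 4 and diff >= gap)


def act(t):
    """Terminal action based on T_j only: declare the apparently better arm."""
    return select(t)


def run(theta, seed=0):
    """Simulate one sequential procedure; theta = (p_0, p_1) is the state of nature."""
    rng = random.Random(seed)
    t = Stat()
    while not stop(t):
        arm = select(t)
        x = 1 if rng.random() < theta[arm] else 0
        t = t.update(arm, x)
    return t, act(t)
```

Two histories with the same counts, such as `[(0, 1), (1, 0), (0, 0)]` and `[(0, 0), (0, 1), (1, 0)]`, give the same value of `summarize` and therefore the same selection, stopping, and terminal decisions. That is the sense in which the rules above depend on the past only through $T_j$.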

Citation


K. B. Gray Jr. "Sequential Selection of Experiments." Ann. Math. Statist. 39 (6) 1953-1977, December, 1968. https://doi.org/10.1214/aoms/1177698025

Information

Published: December, 1968
First available in Project Euclid: 27 April 2007

zbMATH: 0187.16202
MathSciNet: MR243690
Digital Object Identifier: 10.1214/aoms/1177698025

Rights: Copyright © 1968 Institute of Mathematical Statistics

Vol.39 • No. 6 • December, 1968