Asymptotically Optimum Properties of Certain Sequential Tests

Seok Pin Wong

doi:10.1214/aoms/1177698250

Ann. Math. Statist. 39(4): 1244-1263 (August, 1968). DOI: 10.1214/aoms/1177698250

Abstract

Let $X_1, X_2, \cdots$ be independent and identically distributed random variables whose common distribution is of the one-parameter Koopman-Darmois type, i.e., the density function of $X_1$ relative to some $\sigma$-finite nondegenerate measure $F$ on the real line can be written as $f(x, \theta) = \exp (\theta x - b(\theta))$, where $b(\theta)$ is some real function of the parameter $\theta$. Consider the hypotheses $H_0 = \{\theta \leqq \theta_0\}$ and $H_1 = \{\theta \geqq \theta_1\}$, where $\theta_0 < \theta_1$ and $\theta_0, \theta_1$ are in $\Omega$, the natural parameter space. We want to decide sequentially between the two hypotheses. Suppose $l(\theta)$ is the loss for making a wrong decision when $\theta$ is the true parameter, and assume $0 \leqq l(\theta) \leqq 1$ for all $\theta$ and $l(\theta) = 0$ if $\theta$ is in $(\theta_0, \theta_1)$, i.e., $(\theta_0, \theta_1)$ is an indifference zone. Let $c$ be the cost of each observation. It is sufficient to let the decision depend on the sequence $(n, S_n), n \geqq 1$, where $S_n = X_1 + \cdots + X_n$. We shall consider the observed values of $(n, S_n)$ as points in a $(u, v)$ plane. Then, for any test, the region in the $(u, v)$ plane where sampling does not stop is called the continuation region of the test. A test and its continuation region will be denoted by the same symbol. Schwarz [4] introduced an a priori distribution $W$ and studied the asymptotic shape of the Bayes continuation region, say $B_W(c)$, as $c \rightarrow 0$. He showed that $B_W(c)/\ln c^{-1}$ approaches, in a certain sense, a region $B_W$ that depends on $W$ only through its support. Whereas Schwarz's work is concerned with Bayes tests, in this paper the main interest is in characteristics of sequential tests as functions of $\theta$. In particular, it is desired to minimize the expected sample size (uniformly in $\theta$ if possible) subject to certain bounds on the error probabilities.
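For concreteness, two standard families fit the form $f(x, \theta) = \exp(\theta x - b(\theta))$. The examples below are illustrations added here and are not taken from the paper; the dominating measures $F$ chosen in each case are assumptions made for the example.

```latex
% General facts for f(x, \theta) = \exp(\theta x - b(\theta)) relative to F,
% valid for \theta in the interior of \Omega:
\[
  b(\theta) = \ln \int e^{\theta x}\, dF(x), \qquad
  E_\theta X_1 = b'(\theta), \qquad
  \operatorname{Var}_\theta X_1 = b''(\theta) > 0 .
\]
% Example 1 (assumed): F is the N(0, 1) distribution, so X_1 is N(\theta, 1).
\[
  f(x, \theta) = \exp\bigl(\theta x - \tfrac{1}{2}\theta^2\bigr), \qquad
  b(\theta) = \tfrac{1}{2}\theta^2, \qquad
  \Omega = (-\infty, \infty).
\]
% Example 2 (assumed): F is counting measure on {0, 1}, so X_1 is Bernoulli(p).
\[
  \theta = \ln\frac{p}{1 - p}, \qquad
  b(\theta) = \ln\bigl(1 + e^{\theta}\bigr), \qquad
  b'(\theta) = \frac{e^{\theta}}{1 + e^{\theta}} = p .
\]
```

In both examples $b$ is strictly convex, so $E_\theta X_1 = b'(\theta)$ is strictly increasing in $\theta$.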
Our approach, like Schwarz's, is asymptotic, as $c \rightarrow 0$. It turns out that an asymptotically optimum test, in the sense indicated above, is $B_W \ln c^{-1}$ if $W$ is a measure that dominates Lebesgue measure. Such a measure will be denoted by $L$ (for Lebesgue dominating) from now on. Thus, Bayes tests, as a tool, will play a significant role in this paper. In order to prove the optimum characteristic of $B_L \ln c^{-1}$, some other results, of interest in their own right, are established. For any $W$ satisfying certain conditions that will be given later, we show that the stopping variable $N(c)$ of $B_W(c)$ approaches $\infty$ a.e. $P_\theta$ for every $\theta$ in $\Omega$. This result, together with Schwarz's result that $B_W(c)/\ln c^{-1}$ approaches a finite region, leads to the following results: (i) for $B_W(c)$, $E_\theta N(c)/\ln c^{-1}$ tends to a constant for each $\theta$ in $\Omega$, and (ii) the same is true for the stopping variable of $B_W \ln c^{-1}$. Furthermore, it is shown that for $B_L \ln c^{-1}$ the error probabilities tend to zero faster than $c \ln c^{-1}$. Consequently, the contributions of the expected sample sizes of both $B_L \ln c^{-1}$ and $B_L(c)$ to their integrated risks, over any $L$-measure, approach 100%. Moreover, $B_L \ln c^{-1}$ is asymptotically Bayes. The last result can be shown without (i), since it is sufficient to show (ii) and to apply the same argument used by Kiefer and Sacks [3] in the proof of their Theorem 1. But we show (i) because of its intrinsic interest and present a different proof using (i). Kiefer and Sacks assumed a more general distribution for $X_1$, constructed a procedure $\delta^{'I}_c$ and showed that it is asymptotically Bayes. Our $B_L \ln c^{-1}$ is somewhat more explicit than their $\delta^{'I}_c$. We would also like to point out that an example of $B_L \ln c^{-1}$, when the distribution is normal, is very briefly discussed in their work.
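The statement about the share of the expected sample size in the integrated risk can be made concrete as follows. This is only a sketch: it assumes that the integrated risk has the usual form (expected loss plus sampling cost, averaged over $W$), that the error probabilities are $o(c \ln c^{-1})$ uniformly over $H_0 \cup H_1$, and that the limiting constant in (ii), written $m(\theta)$ here, satisfies $\int m \, dW > 0$. None of the displayed formulas is quoted from the paper.

```latex
% Assumed form of the integrated risk of a test \delta with stopping variable N:
\[
  r_c(W, \delta) = \int \bigl[\, l(\theta)\, P_\theta(\text{error} \mid \delta)
                   + c\, E_\theta N \,\bigr]\, dW(\theta).
\]
% For \delta = B_L \ln c^{-1}: since 0 \le l \le 1 and l = 0 outside H_0 \cup H_1,
% the loss term is bounded by the largest error probability:
\[
  \int l(\theta)\, P_\theta(\text{error} \mid \delta)\, dW(\theta)
  \le \sup_{\theta \in H_0 \cup H_1} P_\theta(\text{error} \mid \delta)
  = o(c \ln c^{-1}).
\]
% If E_\theta N / \ln c^{-1} \to m(\theta) for each \theta, Fatou's lemma gives
\[
  \liminf_{c \to 0} \frac{c \int E_\theta N \, dW(\theta)}{c \ln c^{-1}}
  \ge \int m(\theta)\, dW(\theta) > 0,
\]
% so the sampling-cost share of the integrated risk tends to one:
\[
  \frac{c \int E_\theta N \, dW(\theta)}{r_c(W, \delta)} \longrightarrow 1
  \qquad (c \to 0).
\]
```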
We shall restrict ourselves to a priori distributions $W$ for which $\sup (\operatorname{mod} W) H_0 = \theta_0$, $\inf (\operatorname{mod} W) H_1 = \theta_1$ and $0 < W(H_0 \cup H_1) < 1$. The phrase "for any $W$" or "for every $W$" is to be understood in that sense. Any Lebesgue dominating measure satisfies these conditions, as does the following type of $W$ that will be used: the support of $W$ consists of $\theta_0, \theta_1$ and a third point $\theta^\ast$, $\theta_0 < \theta^\ast < \theta_1$. Such a $W$ will be called a $\theta^\ast$-measure, and the corresponding $B_W$ denoted by $B_{\theta^\ast}$. From Schwarz's equations for $B_W$ it follows readily that $B_L \subset B_W$ for every $W$. In particular, $B_L \subset B_{\theta^\ast}$. As a consequence, the statement about the error probabilities, as well as others concerning $B_L \ln c^{-1}$ in the last paragraph, remain true when $L$ is replaced by $\theta^\ast$ or any $W$. These geometric characteristics will be dealt with in Section 2. We shall also show there that $\partial B_{\theta^\ast}$, the boundary of $B_{\theta^\ast}$ (which consists of line segments), is tangent to $\partial B_L$ at some point, and that if $\theta^\ast$ is such that $b'(\theta^\ast) = (b(\theta_1) - b(\theta_0))/(\theta_1 - \theta_0)$ then $\max_{(u, v) \text{ in } B_L} u = \max_{(u, v) \text{ in } B_{\theta^\ast}} u$. Let the ray through the origin and with slope equal to $E_\theta X_1$ intersect $\partial B_L$ at $(m(\theta), m(\theta)E_\theta X_1)$. In Section 3, after proving $\lim_{c\rightarrow 0} N(c) = \infty$ a.e. $P_\theta$, we show $\lim_{c\rightarrow 0} N(c)/\ln c^{-1} = m(\theta)$ a.e. $P_\theta$ and $\lim_{c\rightarrow 0} E_\theta N(c)/\ln c^{-1} = m(\theta)$. It is shown in Section 4 that $\sup_{\theta \text{ in } H_0 \cup H_1} P_\theta(\text{error} \mid B_L \ln c^{-1}) = o(c \ln c^{-1})$. The main results are given in Section 5.
We first show that after dividing by $c \ln c^{-1}$, the difference of the integrated risks of $B_L \ln c^{-1}$ and $B_W(c)$, for any $W$, tends to zero. It follows from this result that $B_L \ln c^{-1}$ asymptotically minimizes the maximum (over $\theta$ in $\Omega$) expected sample size in $\mathscr{F}(c)$, a family of tests whose error probabilities are bounded by $\max_{i=0,1} P_{\theta_i}(\text{error} \mid B_L \ln c^{-1})$. The precise statement is given in Theorem 5.1. A sharper result under a stronger hypothesis is given in Theorem 5.2, which states that $B_L \ln c^{-1}$ asymptotically minimizes the expected sample size $E_\theta N$ for each $\theta$ with $\theta_0 < \theta < \theta_1$, among all procedures of $\mathscr{F}(c)$ for which $E_{\theta_0}N/\ln c^{-1}$ and $E_{\theta_1}N/\ln c^{-1}$ are bounded in $c$.
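As an illustration of the condition $b'(\theta^\ast) = (b(\theta_1) - b(\theta_0))/(\theta_1 - \theta_0)$ stated above, the following worked example assumes the normal family with unit variance, for which $b(\theta) = \theta^2/2$. That choice is an assumption made here for illustration and is not a computation taken from the paper.

```latex
% Condition from the text:
%   b'(\theta^\ast) = (b(\theta_1) - b(\theta_0)) / (\theta_1 - \theta_0).
% Assumed example: b(\theta) = \theta^2 / 2, so b'(\theta) = \theta = E_\theta X_1.
\[
  \theta^\ast
  = \frac{\tfrac{1}{2}\theta_1^2 - \tfrac{1}{2}\theta_0^2}{\theta_1 - \theta_0}
  = \frac{\theta_0 + \theta_1}{2}.
\]
% More generally, b is differentiable and strictly convex on (\theta_0, \theta_1),
% so b' is strictly increasing and the mean value theorem yields exactly one
% such \theta^\ast in (\theta_0, \theta_1).
```

In this assumed normal case the distinguished $\theta^\ast$ is simply the midpoint of the indifference zone.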

Citation

Seok Pin Wong. "Asymptotically Optimum Properties of Certain Sequential Tests." Ann. Math. Statist. 39 (4) 1244 - 1263, August, 1968. https://doi.org/10.1214/aoms/1177698250