Abstract
Let $y_1, y_2, \ldots$ be a sequence of random variables with known joint distribution. We are allowed to observe the $y$'s sequentially. We must terminate the observation process at some point, and if we stop at time $n$, we receive a reward which is a known function of $y_1, \ldots, y_n$. Our decision to stop at time $n$ is allowed to depend on the observations we have previously made but may not depend on the future, which is still unknown. We are interested in finding stopping rules which maximize our expected terminal reward.
More formally, let $(x_n, F_n)_{1 \leqq n}$ be a stochastic sequence on a probability space $(W, F, P)$, i.e., let $(F_n)$ be an increasing sequence of sub-sigma-algebras of $F$ and for each $n \geqq 1$ let $x_n$ be a random variable (rv) measurable with respect to $F_n$. In terms of the intuitive background of the preceding paragraph, $F_n = B(y_1, \ldots, y_n)$; and although it is convenient to keep this interpretation in mind, our general results do not depend on it. A stopping rule or stopping variable (sv) is a rv $t$ with values $1, 2, \ldots, +\infty$, such that $P(t < \infty) = 1$ and, for each $n \geqq 1$, $(t = n) \in F_n$. Then $x_t$ is (up to an equivalence) a rv, and if $v = \sup Ex_t$, where the supremum is taken over all sv's such that $Ex_t$ exists, we are interested in answering the following questions: (a) What is $v$? (b) Is there an optimal sv, i.e., one for which $Ex_t$ exists and equals $v$? (c) If there exists an optimal sv, what is it? The problem stated above is not sufficiently well formulated, as the class of sv's $t$ such that $Ex_t$ exists may be vacuous. To avoid this and other uninteresting complications we shall add the assumption that $E|x_n| < \infty$, $n \geqq 1$. We recall that the essential supremum (e. sup) of a family of rv's $\{q_t, t \in T\}$ is a rv $Q$ such that (1) $Q \geqq q_t$ a.s., $t \in T$, and (2) if $Q'$ is any rv such that $Q' \geqq q_t$ a.s., $t \in T$, then $Q' \geqq Q$ a.s.
It is known that the essential supremum of a family of rv's always exists and can be assumed to be the supremum of some countable subfamily (e.g., [12], p. 44). Let $C_n$ be the class of all sv's $t$ such that $P(t \geqq n) = 1$ and $Ex_t^- < \infty$. Let $f_n = \operatorname{e.\,sup}_{t \in C_n} E(x_t \mid F_n)$ and $v_n = \sup_{t \in C_n} Ex_t$. It is known (Theorem 2 of [3]) that if $v < \infty$ and an optimal rule exists, then $s = \text{first } n \geqq 1$ such that $x_n = f_n$ is an optimal rule. For this and various other reasons which will become apparent, e.g., Theorem 1 below, it is desirable to have a constructive method for computing the $f_n$. The technique of backward induction and taking limits, originating with [1] and described in Theorem 2 below, achieves the desired result under certain conditions (see Theorem 2 of [4] for a general statement of these conditions). The central theorem of Section 2 provides completely general methods for computing the $f_n$. Although it seems unlikely that one would ever find it desirable to carry out these computations, there are, nevertheless, several interesting applications of the results to the theory of optimal stopping rules, and it is these applications which concern us throughout the remainder of this paper. In the course of these investigations we find it convenient to introduce the notion of an extended sv, i.e., we drop the requirement that $t$ be finite with probability one while defining $x_\infty$ to be $\limsup x_n$. We show that $\bar f_n = f_n$, where, relative to the class of extended sv's, $\bar f_n$ is defined analogously to $f_n$. We utilize extended sv's as a technical device within the framework of the usual theory and give examples which illustrate the inherent value of these sv's. In Section 3 we define the Markov case.
We show that by paying proper attention to the Markovian structure of many stopping rule problems we are able to simplify the general theory somewhat and to give relatively simple descriptions of optimal rules when they exist. We also define randomized sv's and show that randomization does not increase $v$. We then apply this result to prove the monotonicity and continuity of $v = v(p)$ in the case where $x_n$ is the proportion of heads in $n$ independent tosses of a coin having probability $p$ of heads on each toss.
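As an illustration only, the backward induction mentioned above can be sketched for the coin-tossing case. The sketch is ours and rests on a simplifying assumption: we truncate the problem at a finite horizon $N$ and force a stop at time $N$. Writing $f_n^N$ for the truncated analogue of $f_n$, the standard recursion is $f_N^N = x_N$ and $f_n^N = \max(x_n, E(f_{n+1}^N \mid F_n))$ for $n < N$. With $x_n$ the proportion of heads, the pair (number of tosses, number of heads) is a sufficient state. The resulting number is the best expected reward over rules bounded by $N$, hence only a lower bound for $v(p)$; the limiting step is not carried out here. The names `truncated_value` and `horizon` are hypothetical.

```python
def truncated_value(p: float, horizon: int) -> float:
    """Best expected proportion of heads over rules that stop by `horizon`.

    Assumption (not from the paper): the problem is truncated, so the
    result is a lower bound for v(p), not v(p) itself.
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    # At n = horizon we are forced to stop: the reward is heads / horizon.
    values = [h / horizon for h in range(horizon + 1)]
    # Step backward: at time n with h heads, compare stopping (h / n)
    # with the expected value of observing one more toss.
    for n in range(horizon - 1, 0, -1):
        values = [
            max(h / n, p * values[h + 1] + (1 - p) * values[h])
            for h in range(n + 1)
        ]
    # A stopping rule takes values 1, 2, ..., so the first toss is always observed.
    return p * values[1] + (1 - p) * values[0]


def is_nondecreasing(seq, tol=1e-12):
    """True if each entry is at least the previous one, up to `tol`."""
    return all(b >= a - tol for a, b in zip(seq, seq[1:]))
```

For example, `truncated_value(0.5, 2)` equals `0.625`: after one head we stop with reward 1, and after one tail we continue and expect 0.25. Evaluating `truncated_value(p, 60)` on a grid of $p$ values and passing the results to `is_nondecreasing` gives a numerical check, for the truncated problem only, of the monotonicity in $p$ stated above.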
Citation
David Oliver Siegmund. "Some Problems in the Theory of Optimal Stopping Rules." Ann. Math. Statist. 38 (6) 1627 - 1640, December, 1967. https://doi.org/10.1214/aoms/1177698596