A Markov Stopping Problem for Which no Entry Time is $\epsilon$-Optimal

Howard M. Taylor

doi:10.1214/aoms/1177696708

Ann. Math. Statist. 41(6): 2105-2112 (December, 1970). DOI: 10.1214/aoms/1177696708

Abstract

Let $(X(t): t \geqq 0)$ be a Markov process with lifetime $\zeta$, let $g$ be a bounded continuous nonnegative function on the state space of the process, and let $P_x$ (respectively, $E_x$) be the probability measure (respectively, expectation operator) associated with paths starting at $x$. For any (extended real-valued) Markov time $T$ let $f(x, T) = \int_{T < \zeta} g(X(T))dP_x$ and let $f(x) = \sup_Tf(x,T)$. For any nonnegative $\varepsilon$, a Markov time $T^\ast$ is called (i) $\varepsilon$-optimal at $x$ if $f(x, T^\ast) \geqq f(x) - \varepsilon$; (ii) optimal at $x$ if $f(x, T^\ast) \geqq f(x)$; (iii) $\varepsilon$-optimal if $f(x, T^\ast) \geqq f(x) - \varepsilon$ for all $x$; and (iv) optimal if $f(x, T^\ast) \geqq f(x)$ for all $x$. We interpret $g(X(T))$ as a reward associated with stopping at time $T$ in state $X(T)$ and we are searching for stopping rules or Markov times $T^\ast$ which maximize or nearly maximize the expected value of this reward. Since the process is Markov and the reward depends only on the state in which one stops and not on the time nor the previous process history, one would anticipate that an optimal stopping time (provided one exists) would be of the form "Continue as long as $f(X(t)) > g(X(t))$ and stop when first $f(X(t)) \leqq g(X(t))$." That is, at time $t$ in state $X(t) = x$, one continues if the optimal reward from continuing $f(x)$ exceeds the reward from stopping $g(x)$; otherwise, one stops. This reasoning leads one to hope that the search for optimal rules can be restricted without loss to rules specified by a partition of the state space into two sets, one of continuation states and one of stopping states. 
Thus we ask under what conditions \begin{equation*}\tag{1} f(x) = \sup f(x, T(\Gamma))\end{equation*} where the supremum is over all appropriately measurable sets $\Gamma$ and $T(\Gamma)$ is the process entry time to $\Gamma$: \begin{align*} \tag{2} T(\Gamma) &= \infty\quad \text{if } X(t) \notin \Gamma \text{ for all } t \geqq 0; \\ &= \inf\{t:t \geqq 0 \text{ and } X(t)\in \Gamma\},\quad \text{otherwise}.\end{align*} Dynkin (1963) shows that if $X$ is a standard process then for $\varepsilon \geqq 0$, $\Gamma_\varepsilon = \{x:f(x) - \varepsilon \leqq g(x)\}$ is closed in the fine or intrinsic topology and for $\varepsilon > 0$, $T(\Gamma_\varepsilon)$ is $\varepsilon$-optimal, so that in particular (1) holds. When $g$ is unbounded, Dynkin (1968) interprets a result of Chow and Robbins (1967) to state that for $\varepsilon \geqq 0$, if $P_x \lbrack T(\Gamma_\varepsilon) < \infty \rbrack = 1$ for all $x$, then $T(\Gamma_\varepsilon)$ is $\varepsilon$-optimal. Taylor (1968) shows that if in addition the transition semigroup of the process is Feller (leaves invariant the space of bounded continuous functions) then $\Gamma_0$ is closed, and if there exists an optimal Markov time, then $T(\Gamma_0)$ is optimal at all points $x$ for which $f(x) < \infty$. In this paper we present a Markov process on continuous path space and a bounded continuous nonnegative $g$ such that for some positive $\varepsilon$ and some point $x$, no entry time is $\varepsilon$-optimal at $x$. It follows that no entry time can satisfy any of the other, more stringent, criteria for optimality. Our process is Markov but not strong Markov (hence, not Feller) and was suggested to me by a related example I learned from David Freedman (1969). Our construction is based on a transformation by a singular continuous strictly increasing function. A local time transformation serves the same purpose in Freedman's example.
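As an illustration only, the entry time in (2) and the set $\Gamma_\varepsilon$ can be mimicked on a path sampled at finitely many times. The sketch below is not part of the paper: the names `entry_time` and `epsilon_stopping_set` are hypothetical, the path is assumed to be given as a finite list of samples, and the value function $f$ is assumed to be supplied by the caller (in practice it is usually unknown).

```python
import math
from typing import Callable, Sequence


def entry_time(
    path: Sequence[float],
    times: Sequence[float],
    in_gamma: Callable[[float], bool],
) -> float:
    """Discrete analogue of T(Gamma) in (2).

    Returns the first sampled time at which the path lies in Gamma,
    or +infinity if no sampled point of the path lies in Gamma.
    """
    for t, x in zip(times, path):
        if in_gamma(x):
            return t
    return math.inf


def epsilon_stopping_set(
    f: Callable[[float], float],
    g: Callable[[float], float],
    eps: float,
) -> Callable[[float], bool]:
    """Membership test for Gamma_eps = {x : f(x) - eps <= g(x)}.

    Assumption: f (the optimal value) is known and passed in by the caller.
    """
    return lambda x: f(x) - eps <= g(x)
```

For example, `entry_time(path, times, epsilon_stopping_set(f, g, 0.1))` would give the sampled version of $T(\Gamma_{0.1})$. Note that such a rule looks only at the current state, which is exactly the restriction the paper's example exploits.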
The idea behind the example is quite simple, but the details are many and tedious and tend to obscure the basic picture, which is this: Let $y(t)$ be a Brownian motion process starting at the origin. Our process $x(t)$ is a distorted version of $y(t)$. In particular, begin with $x(t) = y(t)$ and continue until first $y(t)$ hits $+1$. Beginning then, distort the process by setting $x(t) = G^{-1}\lbrack y(t) \rbrack$, where $G$ is strictly increasing, continuous and $G(1) = -G(-1) = 1$. Continue until $y(t)$ next reaches $-1$, at which time, revert to $x(t) = y(t)$. Repeat, again waiting until $y(t)$ hits $+1$, then switching to $x(t) = G^{-1}\lbrack y(t)\rbrack$, and so on. For the $x(t)$ process it is clear that a good decision on whether or not to stop should be based in part on whether currently $x(t) = y(t)$ or $x(t) = G^{-1}\lbrack y(t) \rbrack$, or equivalently, whether the process is on a $-1$ to $+1$ section, or on a $+1$ to $-1$ section. In particular, entry times cannot be good because they do not include this information. Granted, to nail down an example, a sufficiently simple stopping problem must be chosen so that a variety of calculations can be made, but it is intuitively clear why entry times cannot be good and how to improve them. This part of our example is presented in Section 3. It is not clear, however, that we can distort the Brownian motion as we have described and yet maintain the Markov property. Here is where we require $G$ to be singular, so that $G^{-1}$ carries a set of full Lebesgue measure into a set, call it $\Lambda$, which is Lebesgue null. At a fixed time $t$, $y(t)$ is not in $\Lambda$ with probability one, since $\Lambda$ is Lebesgue null. Thus, if we observe $x(t)$ in $\Lambda$ we may infer, with probability one, that $x(t) = G^{-1}\lbrack y(t) \rbrack$. Similarly, still for fixed $t$, if we observe $x(t)$ not in $\Lambda$, we infer, with probability one, that $x(t) = y(t)$. This feature preserves the Markov property.
A careful development of this idea is given in Section 2.
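The switching construction described above can be sketched numerically. The following is a rough illustration under stated assumptions, not the paper's construction: the abstract does not specify $G$, so the sketch takes $G$ on $[-1, 1]$ to be a rescaled de Rham (Salem) type singular function with a hypothetical parameter `a` not equal to $1/2$, takes $G$ to be the identity outside $[-1, 1]$, and replaces Brownian motion by a Gaussian random walk with step `dt`. The names `g_inverse` and `simulate` are made up for this example.

```python
import random


def g_inverse(y: float, a: float = 0.3, depth: int = 40) -> float:
    """Approximate G^{-1}(y) for an assumed singular G with G(1) = -G(-1) = 1.

    On [-1, 1], G is a rescaled de Rham-type function (assumption);
    outside [-1, 1] it is taken to be the identity (also an assumption).
    """
    if y <= -1.0 or y >= 1.0:
        return y
    u = (y + 1.0) / 2.0          # rescale y from [-1, 1] to [0, 1]
    x, scale = 0.0, 0.5
    for _ in range(depth):       # build the binary digits of the preimage
        if u < a:
            u = u / a
        else:
            x += scale
            u = (u - a) / (1.0 - a)
        scale /= 2.0
    return 2.0 * x - 1.0         # rescale back to [-1, 1]


def simulate(n_steps: int, dt: float = 1e-3, seed: int = 0):
    """Random-walk approximation of y(t) and the distorted process x(t).

    Returns (ys, xs, modes); modes[i] is True while x = G^{-1}[y],
    i.e. on a section running from +1 down to -1.
    """
    rng = random.Random(seed)
    sd = dt ** 0.5
    y, distorted = 0.0, False
    ys, xs, modes = [], [], []
    for _ in range(n_steps):
        y += rng.gauss(0.0, sd)
        if not distorted and y >= 1.0:
            distorted = True     # y has hit +1: switch to x = G^{-1}[y]
        elif distorted and y <= -1.0:
            distorted = False    # y has hit -1: revert to x = y
        ys.append(y)
        xs.append(g_inverse(y) if distorted else y)
        modes.append(distorted)
    return ys, xs, modes
```

The `modes` list records exactly the information that a rule based only on the current value of $x(t)$ does not see directly, which is the intuition given above for why entry times perform poorly in this example.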

Citation

Howard M. Taylor. "A Markov Stopping Problem for Which no Entry Time is $\epsilon$-Optimal." Ann. Math. Statist. 41 (6) 2105 - 2112, December, 1970. https://doi.org/10.1214/aoms/1177696708