The Annals of Probability

The Optimal Reward Operator in Dynamic Programming

D. Blackwell, D. Freedman, and M. Orkin

Full-text: Open access

Abstract

Consider a dynamic programming problem with analytic state space $S$, analytic constraint set $A$, and semi-analytic reward function $r(x, P, y)$ for $(x, P)\in A$ and $y\in S$: namely, $\{r > a\}$ is an analytic set for all $a$. Let $Tf$ be the optimal reward in one move, with the modified reward function $r(x, P, y) + f(y)$. The optimal reward in $n$ moves is shown to be $T^n0$, a semi-analytic function on $S$. It is also shown that for any $n$ and positive $\varepsilon$, there is an $\varepsilon$-optimal strategy for the $n$-move game, measurable on the $\sigma$-field generated by the analytic sets.
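As an informal illustration of the operator $T$ and the iterate $T^n0$, the sketch below works on a finite toy model, where every measurability question studied in the paper disappears. It assumes that each admissible $P$ at a state $x$ is a probability distribution over next states and that $(Tf)(x) = \max_{P} \sum_y P(y)\,[r(x, P, y) + f(y)]$, the finite analogue of the one-move optimal reward. The names `apply_T`, `n_move_value`, and the two-state model are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List

State = int
Dist = Dict[State, float]  # a distribution P over next states (assumed finite)


def apply_T(
    f: Dict[State, float],
    states: List[State],
    actions: Dict[State, List[Dist]],
    reward: Callable[[State, Dist, State], float],
) -> Dict[State, float]:
    """One application of T: the best expected value of r(x, P, y) + f(y)."""
    return {
        x: max(
            sum(p * (reward(x, P, y) + f[y]) for y, p in P.items())
            for P in actions[x]
        )
        for x in states
    }


def n_move_value(
    n: int,
    states: List[State],
    actions: Dict[State, List[Dist]],
    reward: Callable[[State, Dist, State], float],
) -> Dict[State, float]:
    """Compute T^n 0 by applying T to the zero function n times."""
    f = {x: 0.0 for x in states}
    for _ in range(n):
        f = apply_T(f, states, actions, reward)
    return f


# Hypothetical two-state model: from state 0 one may stay put or move to
# state 1 with probability 1/2; state 1 is absorbing.
states = [0, 1]
actions = {
    0: [{0: 1.0}, {0: 0.5, 1: 0.5}],
    1: [{1: 1.0}],
}


def reward(x: State, P: Dist, y: State) -> float:
    # Reward 1 for landing in state 1, else 0 (an assumption for the toy model).
    return 1.0 if y == 1 else 0.0


one_move = n_move_value(1, states, actions, reward)
two_moves = n_move_value(2, states, actions, reward)
```

In this toy model the maximum is always attained, so an exactly optimal strategy exists; the $\varepsilon$-optimal strategies in the abstract are needed because, on general analytic spaces, the supremum need not be attained.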

Article information

Source
Ann. Probab. Volume 2, Number 5 (1974), 926-941.

Dates
First available in Project Euclid: 19 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aop/1176996558

Digital Object Identifier
doi:10.1214/aop/1176996558

Mathematical Reviews number (MathSciNet)
MR359818

Zentralblatt MATH identifier
0318.49021

Subjects
Primary: 49C99
Secondary: 60K99: None of the above, but in this section
90C99: None of the above, but in this section
28A05: Classes of sets (Borel fields, $\sigma$-rings, etc.), measurable sets, Suslin sets, analytic sets [See also 03E15, 26A21, 54H05]

Keywords
Dynamic programming; optimal reward; optimal strategy; analytic sets; gambling

Citation

Blackwell, D.; Freedman, D.; Orkin, M. The Optimal Reward Operator in Dynamic Programming. Ann. Probab. 2 (1974), no. 5, 926--941. doi:10.1214/aop/1176996558. https://projecteuclid.org/euclid.aop/1176996558

