In this paper we focus on finite-horizon optimality for denumerable continuous-time Markov decision processes, in which the transition and reward/cost rates are allowed to be unbounded, and optimality is taken over the class of all randomized history-dependent policies. Under mild, reasonable conditions, we first establish the existence of a solution to the finite-horizon optimality equation by designing a technique of approximations from bounded transition rates to unbounded ones. Then we prove the existence of ε (≥ 0)-optimal Markov policies and verify that the value function is the unique solution to the optimality equation by establishing an analog of the Itô-Dynkin formula. Finally, we provide an example in which the transition rates and the value function are all unbounded and thus obtain solutions to some of the unsolved problems posed by Yushkevich (1978).
"Finite-horizon optimality for continuous-time Markov decision processes with unbounded transition rates." Adv. in Appl. Probab. 47 (4) 1064 - 1087, December 2015. https://doi.org/10.1239/aap/1449859800