The Annals of Applied Probability

Continuous-time controlled Markov chains

Xianping Guo and Onésimo Hernández-Lerma

Full-text: Open access

Abstract

This paper studies continuous-time controlled Markov chains, that is, continuous-time Markov decision processes with a denumerable state space, under the discounted cost criterion. The cost and transition rates are allowed to be unbounded and the action set is a Borel space. We first study control problems in the class of deterministic stationary policies and give very weak conditions under which the existence of $\varepsilon$-optimal ($\varepsilon\geq 0$) policies is proved using the construction of a minimum Q-process. We then consider control problems in the class of randomized Markov policies for (1) regular and (2) nonregular Q-processes. To study case (1), we first present a new necessary and sufficient condition for a nonhomogeneous Q-process to be regular. This regularity condition, together with the extended generator of a nonhomogeneous Markov process, is used to prove the existence of $\varepsilon$-optimal stationary policies. Our results for case (1) are illustrated by a Schlögl model with a controlled diffusion. For case (2), we obtain a similar result using Kolmogorov's forward equation for the minimum Q-process, and we also present an example in which our assumptions are satisfied but those used in the previous literature fail to hold.
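
For orientation, the discounted cost criterion and $\varepsilon$-optimality referred to in the abstract are commonly formalized as sketched here. The symbols ($S$ for the state space, $A(i)$ for the admissible actions in state $i$, $c$ for the cost rate, $q(j\mid i,a)$ for the transition rates, $\alpha>0$ for the discount rate) are assumed notation, not quoted from the paper, and the last line is only the standard form of the discounted optimality equation, which holds under suitable conditions.

$$
\begin{aligned}
V_\alpha(i,\pi) &= E_i^{\pi}\!\left[\int_0^\infty e^{-\alpha t}\, c\bigl(x(t),a(t)\bigr)\,dt\right], \qquad V_\alpha^*(i) = \inf_{\pi} V_\alpha(i,\pi),\\
\pi^* \text{ is } \varepsilon\text{-optimal} &\iff V_\alpha(i,\pi^*) \le V_\alpha^*(i) + \varepsilon \quad \text{for all } i\in S,\\
\alpha V_\alpha^*(i) &= \inf_{a\in A(i)} \Bigl\{ c(i,a) + \sum_{j\in S} q(j\mid i,a)\, V_\alpha^*(j) \Bigr\}, \qquad i\in S.
\end{aligned}
$$

With $\varepsilon = 0$ the second line reduces to ordinary optimality, which is why the abstract allows $\varepsilon\geq 0$.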

Article information

Source
Ann. Appl. Probab., Volume 13, Number 1 (2003), 363-388.

Dates
First available in Project Euclid: 16 January 2003

Permanent link to this document
https://projecteuclid.org/euclid.aoap/1042765671

Digital Object Identifier
doi:10.1214/aoap/1042765671

Mathematical Reviews number (MathSciNet)
MR1952002

Zentralblatt MATH identifier
1049.60067

Subjects
Primary: 93E20: Optimal stochastic control
Secondary: 60J27: Continuous-time Markov processes on discrete state spaces; 90C40: Markov and semi-Markov decision processes

Keywords
Nonhomogeneous continuous-time Markov chains; controlled Q-processes; unbounded cost and transition rates; discounted criterion; optimal stationary policies

Citation

Guo, Xianping; Hernández-Lerma, Onésimo. Continuous-time controlled Markov chains. Ann. Appl. Probab. 13 (2003), no. 1, 363--388. doi:10.1214/aoap/1042765671. https://projecteuclid.org/euclid.aoap/1042765671


