Journal of Applied Probability

Blackwell optimality for controlled diffusion processes

Héctor Jasso-Fuentes and Onésimo Hernández-Lerma

Source: J. Appl. Probab. Volume 46, Number 2 (2009), 372-391.

Abstract

In this paper we study $m$-discount optimality ($m \ge -1$) and Blackwell optimality for a general class of controlled (Markov) diffusion processes. To this end, a key step is to express the expected discounted reward function as a Laurent series, and then to search for control policies that lexicographically maximize the $m$th coefficient of this series for $m = -1, 0, 1, \dots$. This approach leads naturally to $m$-discount optimality, and it gives Blackwell optimality in the limit as $m \to \infty$.
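
For orientation, the objects named in the abstract can be sketched in the notation commonly used for sensitive discount criteria. All symbols here are assumptions introduced for illustration and are not taken from the abstract: $\alpha > 0$ is the discount rate, $V_\alpha(x,f)$ is the expected $\alpha$-discounted reward of a policy $f$ from initial state $x$, and $h_m(x,f)$ are the Laurent coefficients. The paper's precise definitions and hypotheses may differ.

```latex
% Sketch under assumed notation; not the paper's own statement.
% Laurent expansion of the discounted reward in the discount rate:
\[
  V_\alpha(x,f) \;=\; \sum_{m=-1}^{\infty} \alpha^{m}\, h_m(x,f)
  \qquad \text{for all sufficiently small } \alpha > 0 .
\]
% A policy f^* is m-discount optimal (m >= -1) if, for every policy f and state x,
\[
  \liminf_{\alpha \downarrow 0}\; \alpha^{-m}
  \bigl[ V_\alpha(x,f^*) - V_\alpha(x,f) \bigr] \;\ge\; 0 .
\]
% A policy f^* is Blackwell optimal if, for every f and x,
% there is a threshold \alpha(x,f) > 0 such that
\[
  V_\alpha(x,f^*) \;\ge\; V_\alpha(x,f)
  \qquad \text{for all } 0 < \alpha < \alpha(x,f) .
\]
```

In this reading, the coefficient $h_{-1}$ plays the role of the long-run average reward and $h_0$ that of the bias, so maximizing the vector $(h_{-1}, h_0, \dots, h_m)$ in lexicographic order corresponds to the successive refinements described above.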

Primary Subjects: 93E20, 60J60
Keywords: Controlled diffusions; average reward; Laurent series; sensitive discount optimality; Blackwell optimality

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.jap/1245676094
Digital Object Identifier: doi:10.1239/jap/1245676094
Zentralblatt MATH identifier: 1165.93038

2009 © Applied Probability Trust