The Annals of Statistics

Conditions for the Equivalence of Optimality Criteria in Dynamic Programming

James Flynn

Full-text: Open access

Abstract

This paper examines the relationships between optimality criteria which are commonly used for undiscounted, discrete-time, countable state Markovian decision models. One approach, due to Blackwell, is to maximize the expected discounted total return as the discount factor approaches 1. Another, due to Veinott, is to maximize the Cesàro means of the finite horizon expected returns as the horizon tends to infinity. A third, due to Derman, is to maximize the long-run average gain. Denardo, Miller and Lippman showed that Blackwell's and Veinott's approaches are equivalent for finite state and action spaces. As shown here, that equivalence breaks down when the state space is countable. Also, policies optimal according to Blackwell's or Veinott's approach need not be optimal according to Derman's. On the positive side, fairly weak conditions are given under which Blackwell's and Veinott's criteria imply Derman's, and somewhat stronger conditions under which Blackwell's and Veinott's criteria are equivalent.
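
For orientation, the three criteria named in the abstract can be written out roughly as follows. This is a sketch in assumed notation, not the paper's own: the symbols r_t, X_0, V_\beta and V_N are introduced here for illustration, and the use of lim inf throughout is an assumption, so the paper's precise definitions may differ.

```latex
% Assumed notation (not taken from the abstract):
%   r_t             reward received at time t
%   X_0             initial state
%   V_\beta(\pi,s)  expected discounted total return of policy \pi from state s
%   V_N(\pi,s)      expected N-period return of policy \pi from state s
\begin{align*}
V_\beta(\pi,s) &= \mathbb{E}_\pi\!\left[\sum_{t=0}^{\infty}\beta^{t} r_t \,\middle|\, X_0=s\right],
\qquad
V_N(\pi,s) = \mathbb{E}_\pi\!\left[\sum_{t=0}^{N-1} r_t \,\middle|\, X_0=s\right]. \\[1ex]
% A policy \pi^* is optimal if, for every policy \pi and every state s:
\text{Blackwell-type (discounting):}\quad
 & \liminf_{\beta\uparrow 1}\,\bigl[V_\beta(\pi^*,s)-V_\beta(\pi,s)\bigr]\;\ge\;0, \\
\text{Veinott-type (average overtaking):}\quad
 & \liminf_{N\to\infty}\,\frac{1}{N}\sum_{n=1}^{N}\bigl[V_n(\pi^*,s)-V_n(\pi,s)\bigr]\;\ge\;0, \\
\text{Derman-type (average gain):}\quad
 & \liminf_{N\to\infty}\,\frac{1}{N}\,V_N(\pi^*,s)\;\ge\;\liminf_{N\to\infty}\,\frac{1}{N}\,V_N(\pi,s).
\end{align*}
```

Read this way, the first two criteria compare differences of returns between policies, while the third compares only their long-run averages; the abstract's results concern when optimality in one sense carries over to another.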

Article information

Source
Ann. Statist., Volume 4, Number 5 (1976), 936-953.

Dates
First available in Project Euclid: 12 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1176343590

Digital Object Identifier
doi:10.1214/aos/1176343590

Mathematical Reviews number (MathSciNet)
MR429138

Zentralblatt MATH identifier
0351.93038

Subjects
Primary: 49C15
Secondary:
62L99: None of the above, but in this section
90C40: Markov and semi-Markov decision processes
93C55: Discrete-time systems
60J10: Markov chains (discrete-time Markov processes on discrete state spaces)
60J20: Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) [See also 90B30, 91D10, 91D35, 91E40]

Keywords
Dynamic programming; Markovian decision process; optimality criteria; average overtaking criteria; average gain; discounting; small interest rates

Citation

Flynn, James. Conditions for the Equivalence of Optimality Criteria in Dynamic Programming. Ann. Statist. 4 (1976), no. 5, 936-953. doi:10.1214/aos/1176343590. https://projecteuclid.org/euclid.aos/1176343590

