Open Access
March, 1974 Averaging vs. Discounting in Dynamic Programming: a Counterexample
James Flynn
Ann. Statist. 2(2): 411-413 (March, 1974). DOI: 10.1214/aos/1176342678

Abstract

We consider countable state, finite action dynamic programming problems with bounded rewards. Under Blackwell's optimality criterion, a policy is optimal if it maximizes the expected discounted total return for all values of the discount factor sufficiently close to 1. We give an example where a policy meets that optimality criterion, but is not optimal with respect to Derman's average cost criterion. We also give conditions under which this pathology cannot occur.
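The pathology turns on the Abelian link between the two criteria. As an illustrative sketch (not the paper's counterexample, and using an arbitrary two-state chain chosen here for demonstration): for a *finite* Markov chain with transition matrix P and reward vector r, the discounted value satisfies (1 − β)·v_β → long-run average reward as β → 1, so a Blackwell-optimal policy is also average-optimal. The counterexample shows this link can break once the state space is countable.

```python
import numpy as np

# Hypothetical 2-state chain (illustration only, not Flynn's example).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])      # transition matrix under a fixed policy
r = np.array([1.0, 0.0])        # reward earned in each state

def discounted_value(beta):
    """Expected discounted total return from each state: (I - beta P)^{-1} r."""
    return np.linalg.solve(np.eye(2) - beta * P, r)

# Stationary distribution pi (solves pi P = pi, entries summing to 1),
# obtained from the eigenvector of P^T for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()
average_reward = pi @ r          # long-run average reward (here 2/3)

beta = 0.9999
print((1 - beta) * discounted_value(beta))  # approaches average_reward
print(average_reward)
```

With finitely many states the limit holds uniformly, which is one way to see why conditions ruling out the pathology must restrict the countable-state setting.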

Citation


James Flynn. "Averaging vs. Discounting in Dynamic Programming: a Counterexample." Ann. Statist. 2 (2) 411 - 413, March, 1974. https://doi.org/10.1214/aos/1176342678

Information

Published: March, 1974
First available in Project Euclid: 12 April 2007

zbMATH: 0276.49019
MathSciNet: MR368791
Digital Object Identifier: 10.1214/aos/1176342678

Subjects:
Primary: 49C15
Secondary: 60J10, 60J20, 62L99, 90C40, 93C55

Keywords: average cost criteria, discounting, dynamic programming, Markov decision process

Rights: Copyright © 1974 Institute of Mathematical Statistics
