The Annals of Statistics

Markov Decision Processes with a New Optimality Criterion: Discrete Time

Stratton C. Jaquette

Abstract

Standard finite state and action discrete time Markov decision processes with discounting are studied using a new optimality criterion called moment optimality. A policy is moment optimal if it lexicographically maximizes the sequence of signed moments of total discounted return with a positive (negative) sign if the moment is odd (even). This criterion is equivalent to being a little risk averse. It is shown that a stationary policy is moment optimal by examining the negative of the Laplace transform of the total return random variable. An algorithm to construct all stationary moment optimal policies is developed. The algorithm is shown to be finite.
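
The criterion can be illustrated with a minimal sketch. This is not the algorithm of the paper. It only compares two hypothetical return distributions by their signed moment sequences (E[R], -E[R^2], E[R^3], ...), truncated at a finite order for illustration. The names `signed_moments`, `moment_preferred`, `safe`, and `risky` are assumptions introduced here, and the returns are taken to be simple finite distributions rather than the output of a Markov decision process.

```python
from fractions import Fraction


def signed_moments(outcomes, probs, order):
    """Return (E[R], -E[R^2], E[R^3], ...) up to the given order.

    Odd moments carry a positive sign and even moments a negative sign,
    as in the definition of moment optimality.
    """
    moments = []
    for n in range(1, order + 1):
        raw = sum(Fraction(p) * Fraction(r) ** n
                  for r, p in zip(outcomes, probs))
        sign = 1 if n % 2 == 1 else -1
        moments.append(sign * raw)
    return tuple(moments)


def moment_preferred(a, b, order=4):
    """True if distribution a beats b lexicographically on signed moments.

    Python compares tuples lexicographically, so the first differing
    signed moment decides. Truncating at `order` is an assumption of
    this sketch; the criterion itself uses the whole sequence.
    """
    return signed_moments(*a, order) > signed_moments(*b, order)


# Hypothetical total discounted returns: same mean, different spread.
safe = ([1], [1])                                    # R = 1 with certainty
risky = ([0, 2], [Fraction(1, 2), Fraction(1, 2)])   # R = 0 or 2, equally likely

print(signed_moments(*safe, 3))
print(signed_moments(*risky, 3))
print(moment_preferred(safe, risky))
```

Both distributions have mean 1, so the tie is broken by the second signed moment, which favors the distribution with the smaller E[R^2]. This matches the reading of the criterion as slight risk aversion: the expansion -E[exp(-sR)] = -1 + s E[R] - (s^2/2) E[R^2] + ... shows that, for small s > 0, the signed moments are compared in exactly this order.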

Article information

Source
Ann. Statist., Volume 1, Number 3 (1973), 496-505.

Dates
First available in Project Euclid: 12 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1176342415

Digital Object Identifier
doi:10.1214/aos/1176342415

Mathematical Reviews number (MathSciNet)
MR378839

Zentralblatt MATH identifier
0259.90054

JSTOR
links.jstor.org

Citation

Jaquette, Stratton C. Markov Decision Processes with a New Optimality Criterion: Discrete Time. Ann. Statist. 1 (1973), no. 3, 496-505. doi:10.1214/aos/1176342415. https://projecteuclid.org/euclid.aos/1176342415