Open Access
November, 1993
Indices for Families of Competing Markov Decision Processes with Influence
K. D. Glazebrook
Ann. Appl. Probab. 3(4): 1013-1032 (November, 1993). DOI: 10.1214/aoap/1177005270


Nash obtained an important extension to the classical theory of Gittins indexation when he demonstrated that index policies were optimal for a class of multiarmed bandit problems with a multiplicatively separable reward structure. We characterise the relevant indices (herein referred to as Nash indices) as equivalent retirement rewards/penalties for appropriately defined maximisation/minimisation problems. We also give a condition which is sufficient to guarantee the optimality of index policies for a Nash-type model in which each constituent bandit has its own decision structure.



K. D. Glazebrook. "Indices for Families of Competing Markov Decision Processes with Influence." Ann. Appl. Probab. 3 (4) 1013 - 1032, November, 1993.


Published: November, 1993
First available in Project Euclid: 19 April 2007

zbMATH: 0795.90084
MathSciNet: MR1241032
Digital Object Identifier: 10.1214/aoap/1177005270

Primary: 90C40

Keywords: Gittins index, Markov decision process, optimal policy, stopping time

Rights: Copyright © 1993 Institute of Mathematical Statistics

