Abstract
We recast a class of denumerable-state, infinite-action Markov renewal programs with unknown parameters as one-state programs with actions corresponding to stationary policies in the original program. Under suitable conditions we find an adaptive (nonstationary) optimal policy in the sense of maximizing long-run expected reward per unit time.
Citation
Bennett L. Fox, John E. Rolph. "Adaptive Policies for Markov Renewal Programs." Ann. Statist. 1 (2) 334-341, March, 1973. https://doi.org/10.1214/aos/1176342370
Information