Abstract
We utilize and develop elements of the recent achievable region account of Gittins indexation by Bertsimas and Niño-Mora to design index-based policies for discounted multi-armed bandits on parallel machines. The policies analyzed have expected rewards which come within an
Citation
K. D. Glazebrook, D. J. Wilkinson. "Index-based policies for discounted multi-armed bandits on parallel machines." Ann. Appl. Probab. 10 (3) 877–896, August 2000. https://doi.org/10.1214/aoap/1019487512
Information