Source: Adv. in Appl. Probab. Volume 38, Number 3 (2006), 643-672.
In 1988 Whittle introduced an important but intractable class of
restless bandit problems which generalise the
multiarmed bandit problems of Gittins by allowing state
evolution for passive projects. Whittle's account deployed a
Lagrangian relaxation of the optimisation problem to develop an
index heuristic. Despite a developing body of evidence (both
theoretical and empirical) which underscores the strong
performance of Whittle's index policy, a continuing challenge to
implementation is the need to establish that the competing
projects all pass an indexability test. In this paper we employ
Gittins' index theory to establish the indexability of
(inter alia) general families of restless bandits which
arise in machine maintenance problems and in stochastic scheduling
problems with switching penalties. We also give formulae for the
resulting Whittle indices. Numerical investigations testify to the
outstandingly strong performance of the index heuristics
concerned.
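
For readers unfamiliar with the index the abstract refers to, the following is a minimal numerical sketch of how a Whittle index can be computed for one small discounted project. It is not the paper's method (the paper derives closed-form formulae via Gittins' index theory). The function names, the subsidy bracket, the discount factor beta and the convention that the passive action earns only the subsidy are all assumptions made for illustration. The project is also assumed to be indexable, so that the set of states where passivity is optimal grows monotonically with the subsidy.

```python
def q_values(P_active, P_passive, reward, subsidy, beta, tol=1e-10):
    """Value-iterate the subsidised one-project problem.

    Assumption: the active action earns reward[s]; the passive action earns
    only the subsidy. Returns the lists (Q_active, Q_passive).
    """
    n = len(reward)
    V = [0.0] * n
    while True:
        qa = [reward[s] + beta * sum(P_active[s][t] * V[t] for t in range(n))
              for s in range(n)]
        qp = [subsidy + beta * sum(P_passive[s][t] * V[t] for t in range(n))
              for s in range(n)]
        V_new = [max(a, p) for a, p in zip(qa, qp)]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return qa, qp
        V = V_new


def whittle_index(state, P_active, P_passive, reward, beta,
                  lo=-100.0, hi=100.0, tol=1e-6):
    """Bisect for the subsidy that makes both actions equally good in `state`.

    Assumption: the project is indexable and the index lies in [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        qa, qp = q_values(P_active, P_passive, reward, mid, beta)
        if qa[state] > qp[state]:
            lo = mid  # active still strictly preferred: index is larger
        else:
            hi = mid  # passive at least as good: index is no larger
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical two-state project: state 0 pays 1 once, then moves to
    # state 1, which pays nothing. The passive action freezes the state.
    Pa = [[0.0, 1.0], [0.0, 1.0]]
    Pp = [[1.0, 0.0], [0.0, 1.0]]
    print(whittle_index(0, Pa, Pp, [1.0, 0.0], beta=0.9))
```

In the example the passive transition matrix is the identity, so the project is an ordinary (non-restless) bandit and the computed value should agree with its Gittins index. A genuinely restless project would use a passive matrix that also moves the state.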
References
Agrawal, R., Hegde, M. and Teneketzis, D. (1988). Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost. IEEE Trans. Automatic Control 33, 899--906. MR959012.
Ansell, P. S., Glazebrook, K. D., Niño-Mora, J. and O'Keeffe, M. (2003). Whittle's index policy for a multi-class queueing system with convex holding costs. Math. Meth. Operat. Res. 57, 21--39.
Asawa, M. and Teneketzis, D. (1996). Multi-armed bandits with switching penalties. IEEE Trans. Automatic Control 41, 328--348.
Banks, J. S. and Sundaram, R. (1994). Switching costs and the Gittins index. Econometrica 62, 687--694.
Gittins, J. C. (1979). Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B 41, 148--177. MR547241.
Gittins, J. C. (1989). Multi-Armed Bandit Allocation Indices. John Wiley, Chichester. MR996417.
Glazebrook, K. D. (1980). On stochastic scheduling with precedence relations and switching costs. J. Appl. Prob. 17, 1016--1024. MR587202.
Glazebrook, K. D., Mitchell, H. M. and Ansell, P. S. (2005). Index policies for the maintenance of a collection of machines by a set of repairmen. Europ. J. Operat. Res. 165, 267--284.
Glazebrook, K. D., Niño-Mora, J. and Ansell, P. S. (2002). Index policies for a class of discounted restless bandits. Adv. Appl. Prob. 34, 754--774.
Nash, P. (1979). Optimal allocation of resources between research projects. Doctoral Thesis, University of Cambridge.
Niño-Mora, J. (2001). Restless bandits, partial conservation laws and indexability. Adv. Appl. Prob. 33, 76--98.
Niño-Mora, J. (2002). Dynamic allocation indices for restless projects and queueing admission control: a polyhedral approach. Math. Program. 93, 361--413.
Papadimitriou, C. H. and Tsitsiklis, J. N. (1999). The complexity of optimal queuing network control. Math. Operat. Res. 24, 293--305.
Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York.
Reiman, M. I. and Wein, L. M. (1998). Dynamic scheduling of a two-class queue with setups. Operat. Res. 46, 532--547.
Van Oyen, M. P. and Teneketzis, D. (1994). Optimal stochastic scheduling of forest networks with switching penalties. Adv. Appl. Prob. 26, 474--479.
Weber, R. R. and Weiss, G. (1990). On an index policy for restless bandits. J. Appl. Prob. 27, 637--648. (Addendum: Adv. Appl. Prob. 23 (1991), 429--430.)
Whittle, P. (1980). Multi-armed bandits and the Gittins index. J. R. Statist. Soc. B 42, 143--149. MR583348.
Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. Spec. Vol. 25A), ed. J. Gani, Applied Probability Trust, Sheffield, pp. 287--298. MR974588.
Whittle, P. (1996). Optimal Control: Basics and Beyond. John Wiley, Chichester.