## Advances in Applied Probability

- Adv. in Appl. Probab.
- Volume 45, Number 1 (2013), 51-85.

### Monotone policies and indexability for bidirectional restless bandits

K. D. Glazebrook, D. J. Hodge, and C. Kirkbride

#### Abstract

Motivated by a wide range of applications, we consider a development of
Whittle's restless bandit model in which project activation requires a
state-dependent amount of a key resource, which is assumed to be
available at a constant rate. As many projects may be activated at each
decision epoch as resource availability allows. We seek a policy for
project activation within resource constraints which minimises an
aggregate cost rate for the system. Project indices derived from a
Lagrangian relaxation of the original problem exist provided the
structural requirement of indexability is met. Verification of this
property and derivation of the related indices is greatly simplified
when the solution of the Lagrangian relaxation has a state monotone
structure for each constituent project. We demonstrate that this is
indeed the case for a wide range of *bidirectional* projects in
which the project state tends to move in a different direction when it
is activated from that in which it moves when passive. This is natural
in many application domains in which activation of a project
ameliorates its condition, which otherwise tends to deteriorate or
deplete. In some cases the state monotonicity required is related to
the structure of state transitions, while in others it is also related
to the nature of costs. Two numerical studies demonstrate the value of
the ideas for the construction of policies for dynamic resource
allocation, most especially in contexts which involve a large number of
projects.

#### Article information

**Source**

Adv. in Appl. Probab., Volume 45, Number 1 (2013), 51-85.

**Dates**

First available in Project Euclid: 15 March 2013

**Permanent link to this document**

https://projecteuclid.org/euclid.aap/1363354103

**Digital Object Identifier**

doi:10.1239/aap/1363354103

**Mathematical Reviews number (MathSciNet)**

MR3077541

**Zentralblatt MATH identifier**

1274.90473

**Subjects**

Primary: 90C40: Markov and semi-Markov decision processes

Secondary: 49L20: Dynamic programming method; 90C39: Dynamic programming [See also 49L20]; 49M20: Methods of relaxation type

**Keywords**

asset management; Gittins index; indexability; inventory management; Lagrangian relaxation; machine maintenance; monotone policy; stochastic dynamic programming; restless bandit; Whittle index

#### Citation

Glazebrook, K. D.; Hodge, D. J.; Kirkbride, C. Monotone policies and indexability for bidirectional restless bandits. Adv. in Appl. Probab. 45 (2013), no. 1, 51-85. doi:10.1239/aap/1363354103. https://projecteuclid.org/euclid.aap/1363354103