Open Access
May, 1995
Levy Bandits: Multi-Armed Bandits Driven by Levy Processes
Haya Kaspi, Avi Mandelbaum
Ann. Appl. Probab. 5(2): 541-565 (May, 1995). DOI: 10.1214/aoap/1177004777

Abstract

Levy bandits are multi-armed bandits driven by Levy processes. As anticipated from existing research, Levy bandits are optimally controlled by an index strategy: One can associate with each arm an index function of its state, and optimal strategies are those that allocate time to arms whose states have the largest index. Furthermore, the index function of an arm is calculated independently of the other arms, and the optimal reward can be expressed in terms of the indices. Somewhat less anticipated, however, is the fact that the index function of an arm, driven by a Levy process, has a representation in terms of the decreasing ladder sets and the exit system of its Levy driver. Moreover, the Wiener-Hopf factorization of the Levy exponents of an arm can be used to obtain the characteristic function of some excursion law, through which the index of the arm is defined. We use this factorization to calculate explicitly index functions and optimal rewards of some interesting Levy bandits, rediscovering along the way that local time naturally quantifies switching in continuous time.

Citation

Haya Kaspi, Avi Mandelbaum. "Levy Bandits: Multi-Armed Bandits Driven by Levy Processes." Ann. Appl. Probab. 5 (2) 541-565, May, 1995. https://doi.org/10.1214/aoap/1177004777

Information

Published: May, 1995
First available in Project Euclid: 19 April 2007

zbMATH: 0830.60065
MathSciNet: MR1336882
Digital Object Identifier: 10.1214/aoap/1177004777

Subjects:
Primary: 60J30
Secondary: 60G40, 60J55

Keywords: Excursions, Levy processes, Local time, multiarmed bandits, multiparameter processes, optional increasing path, Wiener-Hopf factorization

Rights: Copyright © 1995 Institute of Mathematical Statistics
