Open Access
2019 Nonlinear randomized urn models: a stochastic approximation viewpoint
Sophie Laruelle, Gilles Pagès
Electron. J. Probab. 24: 1-47 (2019). DOI: 10.1214/19-EJP312

Abstract

This paper extends the link between stochastic approximation ($SA$) theory and randomized urn models developed in [32], and their applications to clinical trials introduced in [2, 3, 4]. We no longer assume that the drawing rule is uniform among the balls of the urn (which contains $d$ colors); instead, the drawing rule can be reinforced by a function $f$. This is a way to model risk aversion. First, assuming that $f$ is concave or convex and reformulating the dynamics of the urn composition as an $SA$ algorithm with remainder, we derive the $a.s.$ convergence and the asymptotic normality (Central Limit Theorem, $CLT$) of the normalized procedure by calling upon the so-called $ODE$ and $SDE$ methods. An in-depth analysis of the case $d=2$ exhibits two different behaviors: a single equilibrium point when $f$ is concave and, when $f$ is convex, a phase transition from a single attracting equilibrium to a system with two attracting equilibrium points and one repulsive equilibrium point. The latter setting is solved using results on non-convergence toward noisy and noiseless “traps” in order to deduce the $a.s.$ convergence toward one of the attracting points. Second, the special case of a Pólya urn (when the addition rule is the $I_{d}$ matrix) is analyzed, again using results from $SA$ theory about “traps”. Finally, these results are used to solve another urn model with a more natural nonlinear drawing rule, and we conclude with an example of application to optimal asset allocation in finance.
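
As a rough illustration of the kind of dynamics the abstract describes, the following minimal Python sketch simulates a Pólya-type urn (one ball of the drawn color is added at each step) with a nonlinear drawing rule. It is only a sketch under stated assumptions: the abstract does not give the exact drawing rule, so the code assumes that color $i$ is drawn with probability proportional to $f$ applied to its current proportion. The function name `simulate_polya_urn`, its parameters, and the choice $f(x)=x^2$ in the usage note are hypothetical and are not taken from the paper.

```python
import random


def simulate_polya_urn(f, initial, n_steps, seed=0):
    """Simulate a Polya-type urn with a nonlinear drawing rule.

    ASSUMPTION (not stated in the abstract): color i is drawn with
    probability f(p_i) / sum_j f(p_j), where p_i is the current
    proportion of color i in the urn.
    The addition rule is the identity matrix: one ball of the drawn
    color is added at each step.
    """
    rng = random.Random(seed)
    counts = list(initial)
    for _ in range(n_steps):
        total = sum(counts)
        # Reinforced weights: f applied to the current proportions.
        weights = [f(c / total) for c in counts]
        # Draw one color index according to the reinforced weights.
        drawn = rng.choices(range(len(counts)), weights=weights, k=1)[0]
        # Identity addition rule: add one ball of the drawn color.
        counts[drawn] += 1
    total = sum(counts)
    return [c / total for c in counts]
```

For example, `simulate_polya_urn(lambda x: x ** 2, [1, 1], 10_000)` returns the final proportions for a convex choice of $f$ with $d=2$; running it with several seeds gives an informal picture of which limiting proportions a trajectory approaches, without substituting for the convergence results of the paper.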

Citation

Sophie Laruelle, Gilles Pagès. "Nonlinear randomized urn models: a stochastic approximation viewpoint." Electron. J. Probab. 24: 1-47, 2019. https://doi.org/10.1214/19-EJP312

Information

Received: 23 May 2018; Accepted: 29 April 2019; Published: 2019
First available in Project Euclid: 18 September 2019

zbMATH: 07107405
MathSciNet: MR4017116
Digital Object Identifier: 10.1214/19-EJP312

Subjects:
Primary: 62E20, 62L05, 62L20
Secondary: 62F12, 62P10

Keywords: asymptotic normality, bandit algorithms, extended Pólya urn models, non-homogeneous generating matrix, reinforcement, stochastic approximation, strong consistency

Vol.24 • 2019