Abstract
We give a short new proof of the existence of optimal solutions to a continuous-time formulation of the two-armed bandit problem, using a new topological embedding of the set of randomized optional increasing paths. We make no hypothesis on the two-parameter filtration other than completeness and right-continuity.
Citation
Robert C. Dalang. "Randomization in the Two-Armed Bandit Problem." Ann. Probab. 18 (1) 218-225, January 1990. https://doi.org/10.1214/aop/1176990946
Information