Abstract
As does Woodroofe, we consider a Bayesian sequential allocation between two treatments that incorporates a covariate. The goal is to maximize the total discounted expected reward from an infinite population of patients. Although our model is more general than Woodroofe's, we are able to duplicate his main result: The myopic rule is asymptotically optimal.
Citation
Jyotirmoy Sarkar. "One-Armed Bandit Problems with Covariates." Ann. Statist. 19 (4), pp. 1978-2002, December 1991. https://doi.org/10.1214/aos/1176348382