August 2022 Stochastic continuum-armed bandits with additive models: Minimax regrets and adaptive algorithm
T. Tony Cai, Hongming Pu
Ann. Statist. 50(4): 2179-2204 (August 2022). DOI: 10.1214/22-AOS2182

Abstract

We consider $d$-dimensional stochastic continuum-armed bandits with the expected reward function being additive $\beta$-Hölder with sparsity $s$ for $0 < \beta < \infty$ and $1 \le s \le d$. The rate of convergence $\tilde{O}\bigl(s \cdot T^{\frac{\beta+1}{2\beta+1}}\bigr)$ for the minimax regret is established, where $T$ is the number of rounds. In particular, the minimax regret does not depend on $d$ and is linear in $s$. A novel algorithm is proposed and is shown to be rate-optimal, up to a logarithmic factor of $T$.
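As a rough numerical illustration of the stated rate, the sketch below evaluates the exponent (β+1)/(2β+1) of T and the quantity s·T^((β+1)/(2β+1)) with constants and logarithmic factors dropped. The function names `regret_exponent` and `regret_rate` are hypothetical helpers introduced here for illustration; they are not taken from the paper and do not implement its algorithm.

```python
from fractions import Fraction


def regret_exponent(beta):
    """Exponent of T in the stated rate: (beta + 1) / (2 * beta + 1).

    Hypothetical helper for illustration only; beta is the Hölder smoothness.
    """
    beta = Fraction(beta)
    if beta <= 0:
        raise ValueError("beta must be positive")
    return (beta + 1) / (2 * beta + 1)


def regret_rate(s, T, beta):
    """Return s * T**exponent, ignoring constants and logarithmic factors.

    The dimension d is deliberately not an argument: the stated rate
    depends on the sparsity s but not on d.
    """
    if s < 1 or T < 1:
        raise ValueError("s and T must be at least 1")
    return s * T ** float(regret_exponent(beta))


if __name__ == "__main__":
    # Smoother reward functions (larger beta) push the exponent toward 1/2.
    for beta in (Fraction(1, 2), 1, 2, 10):
        print(f"beta={beta}: exponent={regret_exponent(beta)}")
```

Under these assumptions, doubling s doubles the value returned by `regret_rate`, which mirrors the claim that the minimax regret is linear in s, and the exponent decreases from just below 1 toward 1/2 as β grows.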

The problem of adaptivity is also studied. A lower bound on the cost of adaptation to the smoothness is obtained, and the result implies that adaptation for free is impossible in general without further structural assumptions. We then consider adaptive additive stochastic continuum-armed bandits (SCAB) under an additional self-similarity assumption. An adaptive procedure is constructed and is shown to simultaneously achieve the minimax regret for a range of smoothness levels.

Funding Statement

The research was supported in part by NSF Grant DMS-2015259 and NIH Grants R01-GM129781 and R01-GM123056.

Acknowledgments

We would like to thank the Associate Editor and the referees for their detailed and constructive comments which have helped to improve the presentation of the paper.

Citation

T. Tony Cai, Hongming Pu. "Stochastic continuum-armed bandits with additive models: Minimax regrets and adaptive algorithm." Ann. Statist. 50 (4) 2179-2204, August 2022. https://doi.org/10.1214/22-AOS2182

Information

Received: 1 August 2021; Revised: 1 February 2022; Published: August 2022
First available in Project Euclid: 25 August 2022

MathSciNet: MR4474487
zbMATH: 07610767
Digital Object Identifier: 10.1214/22-AOS2182

Subjects:
Primary: 62G08
Secondary: 62L12

Keywords: adaptivity, additive model, bandits, communication constraints, curse of dimensionality, minimax lower bound, optimal rate of convergence, regret, self-similarity

Rights: Copyright © 2022 Institute of Mathematical Statistics

JOURNAL ARTICLE
26 PAGES
