December 2024
Statistical complexity and optimal algorithms for nonlinear ridge bandits
Nived Rajaraman, Yanjun Han, Jiantao Jiao, Kannan Ramchandran
Ann. Statist. 52(6): 2557-2582 (December 2024). DOI: 10.1214/24-AOS2395

Abstract

We consider the sequential decision-making problem where the mean outcome is a nonlinear function of the chosen action. Compared with the linear model, two curious phenomena arise in nonlinear models: first, in addition to the “learning phase” with a standard parametric rate for estimation or regret, there is a “burn-in period” with a fixed cost determined by the nonlinear function; second, achieving the smallest burn-in cost requires new exploration algorithms. For a special family of nonlinear functions named ridge functions in the literature, we derive upper and lower bounds on the optimal burn-in cost, and in addition, on the entire learning trajectory during the burn-in period via differential equations. In particular, a two-stage algorithm that first finds a good initial action and then treats the problem as locally linear is statistically optimal. In contrast, several classical algorithms, such as UCB and algorithms relying on regression oracles, are provably suboptimal.
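
As a rough illustration of the setting in the abstract, the sketch below simulates a ridge bandit whose mean reward depends on the action only through its inner product with an unknown direction. Everything specific here is an assumption made for illustration and is not taken from the abstract: the cubic link, unit-norm actions, Gaussian noise, the dimension, and all function names. It shows one plausible intuition for a burn-in period: a random starting action is nearly orthogonal to the unknown direction, so the signal it produces through a nonlinear link can be very weak.

```python
import numpy as np

def cubic_link(x):
    # Hypothetical link f(x) = x^3, chosen only for illustration.
    return x ** 3

def mean_reward(theta, action, link=cubic_link):
    # Ridge structure: the mean depends on the action only via <theta, action>.
    return link(float(np.dot(theta, action)))

def pull(theta, action, rng, link=cubic_link, noise_std=1.0):
    # One noisy observation (Gaussian noise is an assumption of this sketch).
    return mean_reward(theta, action, link) + noise_std * rng.normal()

def random_unit_vector(dim, rng):
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
dim = 400
theta = random_unit_vector(dim, rng)  # unknown direction, unit norm (assumed)

# Inner products between theta and uniformly random unit-norm actions
# are typically of order 1 / sqrt(dim).
correlations = np.array(
    [abs(np.dot(theta, random_unit_vector(dim, rng))) for _ in range(200)]
)
typical_corr = float(np.median(correlations))

# Through the cubic link, the mean reward at such an action is far smaller
# than the mean reward of the best action (theta itself).
typical_signal = typical_corr ** 3
best_mean = mean_reward(theta, theta)

print(f"typical |<theta, a>| : {typical_corr:.4f}")
print(f"typical mean reward  : {typical_signal:.2e}")
print(f"best mean reward     : {best_mean:.4f}")
```

Under these assumptions, a learner starting from a random action sees a mean reward that is tiny relative to unit-variance noise, which is consistent with the abstract's description of a first stage devoted to finding a good initial action before a locally linear treatment becomes effective.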

Funding Statement

Nived Rajaraman and Jiantao Jiao were partially supported by NSF Grants IIS-1901252 and CIF-2211209. Yanjun Han was supported by the Simons-Berkeley research fellowship and the Norbert Wiener postdoctoral fellowship.

Citation

Nived Rajaraman, Yanjun Han, Jiantao Jiao, Kannan Ramchandran. "Statistical complexity and optimal algorithms for nonlinear ridge bandits." Ann. Statist. 52 (6) 2557-2582, December 2024. https://doi.org/10.1214/24-AOS2395

Information

Received: 1 March 2023; Revised: 1 January 2024; Published: December 2024
First available in Project Euclid: 18 December 2024

Digital Object Identifier: 10.1214/24-AOS2395

Subjects:
Primary: 62L12
Secondary: 62C20, 62K05

Keywords: adaptive sampling, bandit problems, minimax rate, regret bounds, ridge functions, sequential estimation

Rights: Copyright © 2024 Institute of Mathematical Statistics

Vol. 52 • No. 6 • December 2024