April 2023 Nonparametric learning for impulse control problems—Exploration vs. exploitation
Sören Christensen, Claudia Strauch
Ann. Appl. Probab. 33(2): 1569-1587 (April 2023). DOI: 10.1214/22-AAP1849

Abstract

One of the fundamental assumptions in stochastic control of continuous time processes is that the dynamics of the underlying (diffusion) process is known. This is, however, usually obviously not fulfilled in practice. On the other hand, over the last decades, a rich theory for nonparametric estimation of the drift (and volatility) for continuous time processes has been developed. The aim of this paper is bringing together techniques from stochastic control with methods from statistics for stochastic processes to find a way to both learn the dynamics of the underlying process and control in a reasonable way at the same time. More precisely, we study a long-term average impulse control problem, a stochastic version of the classical Faustmann timber harvesting problem. One of the problems that immediately arises is an exploration-exploitation dilemma as is well known for problems in machine learning. We propose a way to deal with this issue by combining exploration and exploitation periods in a suitable way. Our main finding is that this construction can be based on the rates of convergence of estimators for the invariant density. Using this, we obtain that the average cumulated regret is of uniform order $O(T^{-1/3})$.

Funding Statement

The second author gratefully acknowledges financial support of Sapere Aude: DFF-Starting Grant 0165-00061B “Learning diffusion dynamics and strategies for optimal control”.

Acknowledgments

The authors would like to thank the anonymous referees for their constructive comments that improved the quality of this paper.

Citation

Sören Christensen. Claudia Strauch. "Nonparametric learning for impulse control problems—Exploration vs. exploitation." Ann. Appl. Probab. 33 (2) 1569 - 1587, April 2023. https://doi.org/10.1214/22-AAP1849

Information

Received: 1 April 2020; Revised: 1 January 2022; Published: April 2023
First available in Project Euclid: 21 March 2023

zbMATH: 07692297
MathSciNet: MR4564434
Digital Object Identifier: 10.1214/22-AAP1849

Subjects:
Primary: 62M05, 68T05, 93E20
Secondary: 60J60

Keywords: Diffusion processes, exploration vs. exploitation, Faustmann problem, nonparametric statistics, optimal harvesting problem, reinforcement learning, stochastic impulse control

Rights: Copyright © 2023 Institute of Mathematical Statistics

JOURNAL ARTICLE
19 PAGES
