Abstract
Nonparametric density estimation is an unsupervised learning problem. In this work we propose a two-step procedure whose first step casts the density estimation problem as a supervised regression problem. The advantage is that supervised learning methods can be applied afterwards. Compared to the standard nonparametric regression setting, however, the proposed procedure creates dependence among the training samples. To derive statistical risk bounds, one therefore cannot rely on the well-developed theory for i.i.d. data. To overcome this, we prove an oracle inequality for this specific form of data dependence. As an application, we show that under a compositional structure assumption on the underlying density, the proposed two-step method achieves convergence rates that are faster than the standard nonparametric rates. A simulation study illustrates the finite-sample performance.
Funding Statement
The research has been supported by the NWO/STAR grant 613.009.034b and the NWO Vidi grant VI.Vidi.192.021.
Acknowledgments
We are extremely grateful for the detailed comments that we received from the two referees. One referee suggested several improvements including a more streamlined Poissonization argument in the proof of Lemma 7.2. We want to thank Claire Donnat for pointing us to Lindsey’s method and we are grateful to Kaizheng Wang for inspiring discussions.
Citation
Thijs Bos, Johannes Schmidt-Hieber. "A supervised deep learning method for nonparametric density estimation." Electron. J. Statist. 18 (2) 5601-5658, 2024. https://doi.org/10.1214/24-EJS2332
Information