Abstract
We present a theoretical study of two realistic estimators of conditional distribution functions based on random forests. The forest is built from bootstrap samples drawn from the original dataset. The first estimator reuses these bootstrap samples, while the second uses the original sample once the forest has been built. We prove that both estimators of the conditional distribution function are uniformly almost surely (a.s.) consistent. To the best of our knowledge, this is the first proof of a.s. consistency (previous consistency results hold in norm or in probability) that also accounts for the bootstrap step. The consistency result holds for a large class of functions, including additive models and products. Consistency of the conditional quantile estimators then follows from that of the distribution function estimators by standard arguments.
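To make the distinction between the two estimators concrete, the following is a minimal sketch in notation that is assumed here and does not appear in the abstract: a sample $(X^j, Y^j)_{j=1,\dots,n}$, a forest of $k$ trees with randomness $\Theta_1,\dots,\Theta_k$, the leaf $A_n(x;\Theta_\ell)$ of tree $\ell$ containing the query point $x$, and the number of times $B_j(\Theta_\ell)$ that observation $j$ is drawn in the bootstrap sample of tree $\ell$. The exact definitions and assumptions in the paper may differ.

```latex
% Sketch under assumed notation; not copied from the paper.
% Bootstrap-based estimator: leaf weights count bootstrap multiplicities.
\[
  F^{b}_{k,n}(y \mid x)
  = \frac{1}{k} \sum_{\ell=1}^{k} \sum_{j=1}^{n}
    \frac{B_j(\Theta_\ell)\, \mathbf{1}\{X^j \in A_n(x;\Theta_\ell)\}}
         {N^{b}_n(x;\Theta_\ell)}\,
    \mathbf{1}\{Y^j \le y\},
  \qquad
  N^{b}_n(x;\Theta_\ell)
  = \sum_{j=1}^{n} B_j(\Theta_\ell)\, \mathbf{1}\{X^j \in A_n(x;\Theta_\ell)\}.
\]
% Original-sample estimator: same leaves, but every original observation
% falling in the leaf is counted exactly once.
\[
  F^{o}_{k,n}(y \mid x)
  = \frac{1}{k} \sum_{\ell=1}^{k} \sum_{j=1}^{n}
    \frac{\mathbf{1}\{X^j \in A_n(x;\Theta_\ell)\}}
         {N^{o}_n(x;\Theta_\ell)}\,
    \mathbf{1}\{Y^j \le y\},
  \qquad
  N^{o}_n(x;\Theta_\ell)
  = \sum_{j=1}^{n} \mathbf{1}\{X^j \in A_n(x;\Theta_\ell)\}.
\]
% Uniform almost sure consistency, for either estimator F_{k,n}:
\[
  \sup_{y \in \mathbb{R}}
  \bigl| F_{k,n}(y \mid x) - F(y \mid X = x) \bigr|
  \xrightarrow[n \to \infty]{\text{a.s.}} 0 .
\]
% Conditional quantile of level \alpha obtained by generalized inversion:
\[
  \hat{q}_{\alpha}(x) = \inf\{\, y : F_{k,n}(y \mid x) \ge \alpha \,\}.
\]
```

In this sketch both estimators share the same tree partitions; they differ only in which sample supplies the leaf weights, which is the distinction the abstract draws.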
Acknowledgments
We are grateful to Andrés Cuberos, Ecaterina Nisipasu, Mathieu Poulin and Przemyslaw Sloma from SCOR for their valuable comments and support. We are also much indebted to Roland Denis and Benoit Fabrèges for intensive support on computational aspects. We also thank the anonymous reviewers of previous versions of the paper, whose comments helped to improve it.
Citation
Kévin Elie-Dit-Cosaque, Véronique Maume-Deschamps. "Random forest estimation of conditional distribution functions and conditional quantiles." Electron. J. Statist. 16 (2) 6553-6583, 2022. https://doi.org/10.1214/22-EJS2094