Open Access
Generalized random forests
Susan Athey, Julie Tibshirani, Stefan Wager
Ann. Statist. 47(2): 1148-1178 (April 2019). DOI: 10.1214/18-AOS1709


We propose generalized random forests, a method for nonparametric statistical estimation based on random forests (Breiman [Mach. Learn. 45 (2001) 5–32]) that can be used to fit any quantity of interest identified as the solution to a set of local moment equations. Following the literature on local maximum likelihood estimation, our method considers a weighted set of nearby training examples; however, instead of using classical kernel weighting functions that are prone to a strong curse of dimensionality, we use an adaptive weighting function derived from a forest designed to express heterogeneity in the specified quantity of interest. We propose a flexible, computationally efficient algorithm for growing generalized random forests, develop a large sample theory for our method showing that our estimates are consistent and asymptotically Gaussian and provide an estimator for their asymptotic variance that enables valid confidence intervals. We use our approach to develop new methods for three statistical tasks: nonparametric quantile regression, conditional average partial effect estimation and heterogeneous treatment effect estimation via instrumental variables. A software implementation, grf for R and C++, is available from CRAN.
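As a rough illustration of the weighting idea described above, the sketch below turns forest leaf memberships into weights on the training examples and then solves a weighted moment equation for one target, a conditional quantile. It is a minimal sketch under stated assumptions, not the grf implementation: the function names `forest_weights` and `weighted_quantile` are hypothetical, the leaf assignments are taken as given rather than grown by the splitting algorithm the paper proposes, and at least one tree is assumed to place a training example in the query point's leaf.

```python
import numpy as np

def forest_weights(train_leaves, query_leaves):
    """Forest-derived weights for one query point (hypothetical helper).

    train_leaves: (n, B) integer array; entry (i, b) is the leaf of tree b
                  that contains training example i.
    query_leaves: (B,) integer array; leaf of each tree containing the query.
    Each tree spreads weight uniformly over the training examples sharing the
    query's leaf; the forest weight is the average of these over trees.
    """
    same_leaf = (train_leaves == query_leaves[None, :]).astype(float)  # (n, B)
    leaf_sizes = same_leaf.sum(axis=0)                                 # (B,)
    nonempty = leaf_sizes > 0  # assumption: at least one tree is non-empty
    per_tree = same_leaf[:, nonempty] / leaf_sizes[nonempty]
    return per_tree.mean(axis=1)

def weighted_quantile(y, alpha, q):
    """Solve the weighted quantile moment condition (hypothetical helper).

    Returns the smallest observed y whose cumulative weight reaches q,
    i.e. the weighted fraction of outcomes at or below it is at least q.
    """
    order = np.argsort(y)
    cum = np.cumsum(alpha[order])
    idx = np.searchsorted(cum, q * cum[-1])
    return y[order][min(idx, len(y) - 1)]

# Toy usage: three training examples, two trees.
train_leaves = np.array([[0, 0], [0, 1], [1, 1]])
query_leaves = np.array([0, 1])
y = np.array([1.0, 2.0, 3.0])

alpha = forest_weights(train_leaves, query_leaves)  # weights 0.25, 0.5, 0.25
median = weighted_quantile(y, alpha, 0.5)           # 2.0
```

The same weights could be plugged into other moment equations, which is the sense in which the forest acts as an adaptive replacement for a kernel weighting function.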


Susan Athey, Julie Tibshirani, Stefan Wager. "Generalized random forests." Ann. Statist. 47 (2): 1148-1178, April 2019.


Received: 1 July 2017; Revised: 1 April 2018; Published: April 2019
First available in Project Euclid: 11 January 2019

zbMATH: 07033164
MathSciNet: MR3909963
Digital Object Identifier: 10.1214/18-AOS1709

Primary: 62G05

Keywords: Asymptotic theory, Causal inference, instrumental variable

Rights: Copyright © 2019 Institute of Mathematical Statistics
