Open Access
September 2020
Bayesian Regression Tree Models for Causal Inference: Regularization, Confounding, and Heterogeneous Effects (with Discussion)
P. Richard Hahn, Jared S. Murray, Carlos M. Carvalho
Bayesian Anal. 15(3): 965-1056 (September 2020). DOI: 10.1214/19-BA1195

Abstract

This paper presents a novel nonlinear regression model for estimating heterogeneous treatment effects, geared specifically towards situations with small effect sizes, heterogeneous effects, and strong confounding by observables. Standard nonlinear regression models, which may work quite well for prediction, have two notable weaknesses when used to estimate heterogeneous treatment effects. First, they can yield badly biased estimates of treatment effects when fit to data with strong confounding. The Bayesian causal forest model presented in this paper avoids this problem by directly incorporating an estimate of the propensity function in the specification of the response model, implicitly inducing a covariate-dependent prior on the regression function. Second, standard approaches to response surface modeling do not provide adequate control over the strength of regularization over effect heterogeneity. The Bayesian causal forest model permits treatment effect heterogeneity to be regularized separately from the prognostic effect of control variables, making it possible to informatively “shrink to homogeneity”. While we focus on observational data, our methods are equally useful for inferring heterogeneous treatment effects from randomized controlled experiments where careful regularization is somewhat less complicated but no less important. We illustrate these benefits via the reanalysis of an observational study assessing the causal effects of smoking on medical expenditures as well as extensive simulation studies.
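The role of the propensity estimate described above can be illustrated with a toy simulation. The following is a minimal sketch and not the authors' Bayesian causal forest: it replaces the regression tree ensembles with ordinary least squares, assumes a homogeneous treatment effect, and uses a synthetic data-generating process invented here for illustration. The function names (`simulate`, `fit_propensity`, `ols_treatment_coef`) and all numeric settings are hypothetical. It only shows that, when treatment is more likely for units with a high expected outcome, a naive comparison is biased, while a response model that includes an estimated propensity as a covariate recovers the effect.

```python
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def simulate(n=20000, tau=0.5, seed=0):
    """Synthetic data (assumed for illustration): strong confounding by x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    pi = sigmoid(2.0 * x)          # true propensity P(z = 1 | x)
    z = rng.binomial(1, pi)        # treatment indicator
    # Prognostic term depends on x through the propensity; effect tau is constant.
    y = 3.0 * pi + tau * z + rng.normal(size=n)
    return x, z, y


def fit_propensity(x, z, n_iter=25):
    """Logistic regression of z on x via Newton's method; returns fitted probabilities."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (z - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return sigmoid(X @ beta)


def ols_treatment_coef(z, y, pi_hat):
    """Coefficient on z from least squares of y on [1, z, pi_hat]."""
    design = np.column_stack([np.ones_like(y), z, pi_hat])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


x, z, y = simulate()
naive = y[z == 1].mean() - y[z == 0].mean()   # ignores confounding entirely
pi_hat = fit_propensity(x, z)
adjusted = ols_treatment_coef(z, y, pi_hat)   # adjusts for estimated propensity

print(f"true effect: 0.5, naive: {naive:.3f}, propensity-adjusted: {adjusted:.3f}")
```

In this toy setting the linear response model is correctly specified by construction, so the sketch says nothing about regularization or effect heterogeneity, which are the paper's main concerns.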

Version Information

A previous version of the manuscript included a Contributed Discussion by Kolyan Ray, Botond Szabo, and Aad van der Vaart that has been updated after publication to ensure correspondence of the methods used by the authors in their Discussion with the ones used in the main manuscript.

Note

BA Webinar: https://www.youtube.com/watch?v=rIijEBrXTrE

Citation

P. Richard Hahn, Jared S. Murray, Carlos M. Carvalho. "Bayesian Regression Tree Models for Causal Inference: Regularization, Confounding, and Heterogeneous Effects (with Discussion)." Bayesian Anal. 15(3): 965-1056, September 2020. https://doi.org/10.1214/19-BA1195

Information

Published: September 2020
First available in Project Euclid: 31 January 2020

MathSciNet: MR4154846
Digital Object Identifier: 10.1214/19-BA1195

Subjects:
Primary: 62-07, 62J02
Secondary: 62F15

Keywords: Bayesian, causal inference, heterogeneous treatment effects, machine learning, predictor-dependent priors, regression trees, regularization, shrinkage

Vol. 15 • No. 3 • September 2020