Open Access
February 2024
Causal Inference Methods for Combining Randomized Trials and Observational Studies: A Review
Bénédicte Colnet, Imke Mayer, Guanhua Chen, Awa Dieng, Ruohong Li, Gaël Varoquaux, Jean-Philippe Vert, Julie Josse, Shu Yang
Statist. Sci. 39(1): 165-191 (February 2024). DOI: 10.1214/23-STS889


With increasing data availability, causal effects can be evaluated across different data sets, including both randomized controlled trials (RCTs) and observational studies. RCTs isolate the effect of the treatment from that of unwanted (confounding) co-occurring effects, but they may be unrepresentative of the target population and thus lack external validity. On the other hand, large observational samples are often more representative of the target population but can conflate confounding effects with the treatment of interest. In this paper, we review the growing literature on methods for causal inference on combined RCTs and observational studies, striving for the best of both worlds. We first discuss identification and estimation methods that improve the generalizability of RCTs using the representativeness of observational data. Classical estimators include weighting, the difference between conditional outcome models, and doubly robust estimators. We then discuss methods that combine RCTs and observational data either to ensure unconfoundedness of the observational analysis or to improve (conditional) average treatment effect estimation. We also connect and contrast works developed in both the potential outcomes literature and the structural causal model literature. Finally, we compare the main methods using a simulation study and real-world data to analyze the effect of tranexamic acid on the mortality rate in major trauma patients. A review of available code and new implementations is also provided.
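
As a rough illustration of the weighting idea mentioned in the abstract, the following minimal Python sketch reweights stratum-level trial effects by the stratum shares observed in an observational sample (post-stratification, a simple form of weighting for a single categorical covariate). It is not taken from the paper: the function name `poststratified_ate`, the data layout, and the toy numbers are hypothetical, and the sketch assumes that the stratum variable captures every effect-modifying difference between the trial and the target population.

```python
from collections import defaultdict
from statistics import mean


def poststratified_ate(trial, target_strata):
    """Generalize a trial's average treatment effect to a target sample.

    trial: iterable of (stratum, treated, outcome) tuples from the RCT.
    target_strata: iterable of stratum labels from the observational sample.
    (Both layouts are illustrative assumptions, not the paper's notation.)
    """
    # Collect trial outcomes per stratum and per arm (0 = control, 1 = treated).
    outcomes = defaultdict(lambda: {0: [], 1: []})
    for stratum, treated, outcome in trial:
        outcomes[stratum][int(bool(treated))].append(outcome)

    # Count how often each stratum appears in the target sample.
    counts = defaultdict(int)
    for stratum in target_strata:
        counts[stratum] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("target sample is empty")

    ate = 0.0
    for stratum, count in counts.items():
        arms = outcomes.get(stratum)
        # Every target stratum needs treated and control units in the trial.
        if arms is None or not arms[0] or not arms[1]:
            raise ValueError(f"stratum {stratum!r} lacks trial data in both arms")
        # Within-stratum difference in means, weighted by the target share.
        ate += (count / total) * (mean(arms[1]) - mean(arms[0]))
    return ate


# Toy, made-up data: the trial over-represents the "young" stratum.
trial = [
    ("young", 1, 3), ("young", 1, 5), ("young", 0, 1), ("young", 0, 1),
    ("old", 1, 10), ("old", 0, 4),
]
target = ["young", "old", "old", "old"]
print(poststratified_ate(trial, target))  # prints 5.25
```

On these toy numbers the unweighted trial difference in means is 4, while the reweighted estimate is 5.25, because the target sample contains mostly "old" units, whose within-trial effect is larger. With continuous or many covariates, the stratum shares would typically be replaced by a fitted model of trial participation.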

Funding Statement

SY is partially supported by NSF SES 2242776, NIH 1R01AG066883 and FDA 1U01FD007934.


Acknowledgments

This work was initiated by a SAMSI working group jointly led by JJ and SY in the 2020 causal inference program. We would like to acknowledge the helpful discussions during the SAMSI working group meetings. We would also like to acknowledge the discussions and insights from the Traumabase group and physicians, in particular, Drs. François-Xavier AGERON, Tobias GAUSS and Jean-Denis MOYER. In addition, the data analysis could not have been done without the help of Dr. Ian ROBERTS and the CRASH-3 group, who shared the clinical trial data with us. Part of this work was performed while JJ was a visiting researcher at Google Brain Paris. Finally, we would like to warmly thank Issa DAHABREH for his comments, suggestions of additional references and insightful discussions.




Published: February 2024
First available in Project Euclid: 18 February 2024

MathSciNet: MR4718532
Digital Object Identifier: 10.1214/23-STS889

Keywords: Causal effect generalization, data integration, double robustness, heterogeneous data, S-admissibility, transportability

Rights: Copyright © 2024 Institute of Mathematical Statistics
