A multiple imputation procedure for record linkage and causal inference to estimate the effects of home-delivered meals
Mingyang Shan, Kali S. Thomas, Roee Gutman
Ann. Appl. Stat. 15(1): 412-436 (March 2021). DOI: 10.1214/20-AOAS1397


Causal analysis of observational studies requires data that comprise a set of covariates, a treatment assignment indicator and the observed outcomes. However, data confidentiality restrictions or the nature of data collection may distribute these variables across two or more datasets. In the absence of unique identifiers to link records across files, probabilistic record linkage algorithms can be leveraged to merge the datasets. Current applications of record linkage focus on estimating associations between variables that each appear in only one file, rather than on causal relationships. We propose a Bayesian framework for record linkage and causal inference where one file comprises all the covariate and observed outcome information, and the second file consists of a list of all individuals who receive the active treatment. Under certain ignorability assumptions, the procedure properly propagates the error in the record linkage process, resulting in valid statistical inferences. To estimate the causal effects, we devise a two-stage procedure. The first stage performs Bayesian record linkage to multiply impute the treatment assignment for all individuals in the first file, while adjustments for covariate imbalance and imputation of missing potential outcomes are performed in the second stage. This procedure is used to evaluate the effect of Meals on Wheels services on mortality and healthcare utilization among homebound older adults in Rhode Island. In addition, an interpretable sensitivity analysis is developed to assess potential violations of the ignorability assumptions.
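
The two-stage idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation. Stage 1 here draws each record's treatment indicator independently from an assumed per-record linkage probability (a stand-in for a full Bayesian record linkage sampler), stage 2 uses an unadjusted difference in means (a stand-in for the covariate adjustment and potential-outcome imputation the abstract describes), and the results are pooled with the standard multiple-imputation (Rubin's) combining rules, which the abstract does not itself specify. All function names and the toy data are hypothetical.

```python
import random
import statistics


def impute_treatment(link_probs, rng):
    """Stage 1 stand-in: draw one treatment vector.

    Assumption: record i is treated with probability link_probs[i],
    independently of the others. A real linkage sampler would draw a
    whole linkage structure instead.
    """
    return [1 if rng.random() < p else 0 for p in link_probs]


def difference_in_means(outcomes, treatment):
    """Stage 2 stand-in: unadjusted effect estimate and its sampling variance."""
    treated = [y for y, w in zip(outcomes, treatment) if w == 1]
    control = [y for y, w in zip(outcomes, treatment) if w == 0]
    if len(treated) < 2 or len(control) < 2:
        raise ValueError("each group needs at least two records")
    estimate = statistics.mean(treated) - statistics.mean(control)
    variance = (statistics.variance(treated) / len(treated)
                + statistics.variance(control) / len(control))
    return estimate, variance


def rubin_combine(estimates, variances):
    """Pool M completed-data analyses with Rubin's combining rules."""
    m = len(estimates)
    pooled = statistics.mean(estimates)        # average point estimate
    within = statistics.mean(variances)        # within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total = within + (1 + 1 / m) * between     # total variance
    return pooled, total


def run_demo(num_imputations=20, seed=0):
    """Run the sketch on toy data (entirely made up for illustration)."""
    rng = random.Random(seed)
    # Five certain links, five certain non-links, forty uncertain records.
    link_probs = [1.0] * 5 + [0.0] * 5 + [0.5] * 40
    outcomes = [rng.gauss(0.0, 1.0) for _ in link_probs]
    results = [
        difference_in_means(outcomes, impute_treatment(link_probs, rng))
        for _ in range(num_imputations)
    ]
    estimates = [est for est, _ in results]
    variances = [var for _, var in results]
    return rubin_combine(estimates, variances)


if __name__ == "__main__":
    effect, total_variance = run_demo()
    print(f"pooled effect: {effect:.3f}, total variance: {total_variance:.3f}")
```

The between-imputation term is what carries linkage uncertainty into the final variance: if every imputed treatment vector were identical, it would vanish and the pooled variance would reduce to the ordinary within-analysis variance.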


This work was supported in part by a grant from the Gary and Mary West Foundation and a grant from the Patient-Centered Outcomes Research Institute (PCORI/ME-2017C3-9241).


Mingyang Shan, Kali S. Thomas, Roee Gutman. "A multiple imputation procedure for record linkage and causal inference to estimate the effects of home-delivered meals." Ann. Appl. Stat. 15(1): 412-436, March 2021.


Received: 1 December 2019; Revised: 1 August 2020; Published: March 2021
First available in Project Euclid: 18 March 2021

Keywords: Bayesian data analysis, causal inference, missing data, multiple imputation, record linkage

Rights: Copyright © 2021 Institute of Mathematical Statistics

