Open Access
Bayesian propagation of record linkage uncertainty into population size estimation of human rights violations
Mauricio Sadinle
Ann. Appl. Stat. 12(2): 1013-1038 (June 2018). DOI: 10.1214/18-AOAS1178

Abstract

Multiple-systems or capture–recapture methods are common techniques for population size estimation, particularly in the quantitative study of human rights violations. These methods rely on multiple samples from the population, along with information on which individuals appear in which samples. The goal of record linkage techniques is to identify unique individuals across samples based on the information collected on them. Linkage decisions are subject to uncertainty when such information contains errors and missingness, and when different individuals have very similar characteristics. Uncertainty in the linkage should be propagated into the stage of population size estimation. We propose an approach called linkage-averaging to propagate linkage uncertainty, as quantified by some Bayesian record linkage methodologies, into a subsequent stage of population size estimation. Linkage-averaging is a two-stage approach in which the results from the record linkage stage are fed into the population size estimation stage. We show that under some conditions the results of this approach correspond to those of a proper Bayesian joint model for both record linkage and population size estimation. The two-stage nature of linkage-averaging allows us to combine different record linkage models with different capture–recapture models, which facilitates model exploration. We present a case study from the Salvadoran civil war, where we are interested in estimating the total number of civilian killings using lists of witnesses’ reports collected by different organizations. These lists contain duplicates, typographical and spelling errors, missingness, and other inaccuracies that lead to uncertainty in the linkage. We show how linkage-averaging can be used for transferring the uncertainty in the linkage of these lists into different models for population size estimation.
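
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes a toy setting with only two lists, no within-list duplicates, a simple independence capture–recapture model with uniform priors on the capture probabilities, and a prior on the population size N proportional to 1/N. The list sizes, the posterior draws of the overlap, and all function names (`log_post_N`, `posterior_on_grid`, `linkage_average`) are hypothetical. Each draw of the linkage yields its own posterior for N, and the draws are averaged with equal weight.

```python
import math

def log_post_N(N, n1, n2, m):
    """Unnormalized log posterior of N under a toy two-list independence model.

    Assumptions (illustrative only): uniform priors on both capture
    probabilities, integrated out analytically, and a prior p(N) ∝ 1/N.
    n1, n2 are the list sizes and m is the number of linked individuals.
    """
    n = n1 + n2 - m  # distinct individuals observed under this linkage
    if N < n:
        return -math.inf
    lg = math.lgamma
    return (-math.log(N)
            + lg(N + 1) - lg(N - n + 1)        # N! / (N - n)!
            + lg(N - n1 + 1) + lg(N - n2 + 1)  # (N - n1)! (N - n2)!
            - 2 * lg(N + 2))                   # 1 / ((N + 1)!)^2

def posterior_on_grid(n1, n2, m, grid):
    """Normalize the posterior of N over a finite grid of candidate values."""
    logs = [log_post_N(N, n1, n2, m) for N in grid]
    mx = max(logs)
    weights = [math.exp(l - mx) for l in logs]
    total = sum(weights)
    return [w / total for w in weights]

def linkage_average(n1, n2, overlap_draws, grid):
    """Average the posterior of N over posterior draws of the linkage.

    Each draw is summarized here by its overlap count m, which is all the
    toy two-list model needs from a linkage.
    """
    avg = [0.0] * len(grid)
    for m in overlap_draws:
        post = posterior_on_grid(n1, n2, m, grid)
        for i, p in enumerate(post):
            avg[i] += p / len(overlap_draws)
    return avg

# Hypothetical inputs: list sizes and overlap counts from five linkage draws.
n1, n2 = 60, 50
overlap_draws = [18, 20, 20, 21, 23]
grid = list(range(1, 2001))  # truncated support for N

avg_post = linkage_average(n1, n2, overlap_draws, grid)
post_mean = sum(N * p for N, p in zip(grid, avg_post))
print(f"Linkage-averaged posterior mean of N: {post_mean:.1f}")
```

Because the second stage only consumes summaries of each linkage draw, a different capture–recapture model could be swapped in by replacing `log_post_N` while leaving the averaging step unchanged, which mirrors the modularity the abstract describes.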

Citation


Mauricio Sadinle. "Bayesian propagation of record linkage uncertainty into population size estimation of human rights violations." Ann. Appl. Stat. 12 (2) 1013 - 1038, June 2018. https://doi.org/10.1214/18-AOAS1178

Information

Received: 1 November 2017; Revised: 1 April 2018; Published: June 2018
First available in Project Euclid: 28 July 2018

zbMATH: 06980483
MathSciNet: MR3834293
Digital Object Identifier: 10.1214/18-AOAS1178

Keywords: capture–recapture, counting casualties, data linkage, decomposable graphical model, duplicate detection, entity resolution, multiple record linkage, multiple-systems estimation

Rights: Copyright © 2018 Institute of Mathematical Statistics
