Open Access
Bias Correction in Clustered Underreported Data
Guilherme Lopes de Oliveira, Raffaele Argiento, Rosangela Helena Loschi, Renato Martins Assunção, Fabrizio Ruggeri, Márcia D’Elia Branco
Bayesian Anal. 17(1): 95-126 (March 2022). DOI: 10.1214/20-BA1244


The quality of data from poor and socially deprived regions has given rise to many statistical challenges. One of them is the underreporting of vital events, which leads to biased estimates of the associated risks. To deal with underreported count data, models based on compound Poisson distributions are commonly assumed. To be identifiable, such models usually require strong additional information about the probability of reporting the event in all areas of interest, which is not always available. We introduce a novel approach for the compound Poisson model that assumes the areas are clustered according to their data quality. We leverage these clusters to create a hierarchical structure in which the reporting probabilities decrease as we move from the best group to the worst ones. We obtain constraints for model identifiability and prove that prior information is required only about the reporting probability in the areas with the best data quality. Several approaches to modeling the uncertainty about the reporting probabilities are presented, including reference priors. Different features of the proposed methodology are studied through simulation. We apply our model to map the early neonatal mortality risks in Minas Gerais, a Brazilian state with heterogeneous characteristics and substantial socioeconomic inequality.
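To make the bias described above concrete, the following is a minimal sketch and not the authors' model. It assumes the common thinned-Poisson form of underreporting, in which the observed count in an area has mean (expected count) × (relative risk) × (reporting probability). The multiplicative construction of decreasing reporting probabilities, the function names, and all numeric values are hypothetical choices made only for illustration.

```python
import math
import random


def decreasing_reporting_probs(best, factors):
    """Reporting probabilities that decrease from the best cluster to the worst.

    best: assumed reporting probability in the best-quality cluster, in (0, 1].
    factors: hypothetical shrinkage factors in (0, 1), one per worse cluster.
    """
    probs = [best]
    for factor in factors:
        probs.append(probs[-1] * factor)
    return probs


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_cluster(expected, risk, report_prob, n_areas, rng):
    """Simulate underreported counts and return (naive, corrected) risk estimates.

    The naive estimate ignores underreporting; the corrected one divides by the
    reporting probability, which is treated as known here (an assumption).
    """
    total = sum(
        sample_poisson(expected * risk * report_prob, rng) for _ in range(n_areas)
    )
    naive = total / (expected * n_areas)
    return naive, naive / report_prob


if __name__ == "__main__":
    rng = random.Random(0)
    for k, eps in enumerate(decreasing_reporting_probs(0.95, [0.8, 0.6])):
        naive, corrected = simulate_cluster(10.0, 0.5, eps, 20000, rng)
        print(f"cluster {k}: eps={eps:.3f} naive={naive:.3f} corrected={corrected:.3f}")
```

Under these assumptions the naive estimate shrinks toward zero in proportion to the reporting probability, and only the product of risk and reporting probability is recoverable from the counts alone. This is why some external information about reporting is needed for identifiability.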



Guilherme Lopes de Oliveira, Raffaele Argiento, Rosangela Helena Loschi, Renato Martins Assunção, Fabrizio Ruggeri, Márcia D’Elia Branco. "Bias Correction in Clustered Underreported Data." Bayesian Anal. 17 (1) 95-126, March 2022.


Published: March 2022
First available in Project Euclid: 25 September 2020

Digital Object Identifier: 10.1214/20-BA1244

Primary: 62F15
Secondary: 62J12

Keywords: compound Poisson model, generalized Beta distribution, Jeffreys prior, model identifiability, neonatal mortality, underreporting

