The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 9, Number 2 (2015), 572-596.
Sex, lies and self-reported counts: Bayesian mixture models for heaping in longitudinal count data via birth–death processes
Surveys often ask respondents to report nonnegative counts, but respondents may misremember or round to a nearby multiple of 5 or 10. This phenomenon is called heaping, and the error inherent in heaped self-reported numbers can bias estimation. Heaped data may be collected cross-sectionally or longitudinally and there may be covariates that complicate the inferential task. Heaping is a well-known issue in many survey settings, and inference for heaped data is an important statistical problem. We propose a novel reporting distribution whose underlying parameters are readily interpretable as rates of misremembering and rounding. The process accommodates a variety of heaping grids and allows for quasi-heaping to values nearly but not equal to heaping multiples. We present a Bayesian hierarchical model for longitudinal samples with covariates to infer both the unobserved true distribution of counts and the parameters that control the heaping process. Finally, we apply our methods to longitudinal self-reported counts of sex partners in a study of high-risk behavior in HIV-positive youth.
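As a rough illustration of the heaping phenomenon described above, the following Python sketch rounds each true count to the nearest grid multiple with a fixed probability. It is not the birth–death reporting distribution proposed in the paper; the function names, the single rounding probability `p_round`, and the round-half-up rule are all assumptions made for this example.

```python
import random


def heap_count(true_count, grid=5, p_round=0.5, rng=random):
    """Return a reported count (hypothetical toy model, not the paper's BDP).

    With probability p_round the true count is rounded to the nearest
    multiple of `grid` (ties round up); otherwise it is reported exactly.
    """
    if rng.random() < p_round:
        return grid * ((true_count + grid // 2) // grid)
    return true_count


def simulate_reports(true_counts, grid=5, p_round=0.5, seed=0):
    """Apply heap_count to every true count using a seeded generator."""
    rng = random.Random(seed)
    return [heap_count(x, grid, p_round, rng) for x in true_counts]


def share_on_grid(counts, grid=5):
    """Fraction of counts that are exact multiples of `grid`."""
    return sum(1 for c in counts if c % grid == 0) / len(counts)


if __name__ == "__main__":
    true_counts = list(range(1, 101))
    reports = simulate_reports(true_counts, grid=5, p_round=0.5)
    print("share on grid, true:    ", share_on_grid(true_counts))
    print("share on grid, reported:", share_on_grid(reports))
```

Comparing the share of grid multiples before and after reporting shows the excess mass at multiples of 5 that a heaping model has to account for. The toy model leaves out the misremembering and quasi-heaping that the abstract also mentions.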
Received: May 2014
Revised: February 2015
First available in Project Euclid: 20 July 2015
Crawford, Forrest W.; Weiss, Robert E.; Suchard, Marc A. Sex, lies and self-reported counts: Bayesian mixture models for heaping in longitudinal count data via birth–death processes. Ann. Appl. Stat. 9 (2015), no. 2, 572--596. doi:10.1214/15-AOAS809. https://projecteuclid.org/euclid.aoas/1437397102
- Supplemental article. We provide a derivation of the Laplace transform of transition probabilities for a general BDP, the full posterior distribution and an outline of Monte Carlo sampling procedures for unknown parameters.