Abstract
In this paper we consider the estimation of population size from one-source capture–recapture data, that is, a list in which individuals can potentially be found repeatedly and where the question is how many individuals are missed by the list. As a typical example, we provide data from a drug user study in Bangkok from 2001 where the list consists of drug users who repeatedly contact treatment institutions. Drug users with 1, 2, 3, … contacts occur, but drug users with zero contacts are not present, requiring the size of this group to be estimated. Statistically, these data can be considered as stemming from a zero-truncated count distribution. We revisit an estimator for the population size suggested by Zelterman that is known to be robust under potential unobserved heterogeneity. We demonstrate that the Zelterman estimator can be viewed as a maximum likelihood estimator for a locally truncated Poisson likelihood which is equivalent to a binomial likelihood. This result allows the extension of the Zelterman estimator by means of logistic regression to include observed heterogeneity in the form of covariates. We also review an estimator proposed by Chao and explain why we are not able to obtain similar results for this estimator. The Zelterman estimator is applied in two case studies, the first a drug user study from Bangkok, the second an illegal immigrant study in the Netherlands. Our results suggest the new estimator should be used, in particular, if substantial unobserved heterogeneity is present.
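The estimators named in the abstract can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation. It assumes the standard textbook forms: the Zelterman rate estimate 2·f2/f1, with population size n/(1 − exp(−rate)), and Chao's lower bound n + f1²/(2·f2). Here f1 and f2 are the numbers of individuals seen exactly once and exactly twice, and n is the number of observed individuals. The function names and the frequencies in the usage note are hypothetical and are not taken from the Bangkok or Netherlands studies.

```python
import math


def zelterman_lambda(f1, f2):
    """Zelterman's Poisson rate estimate, using only counts of ones and twos."""
    return 2.0 * f2 / f1


def zelterman_lambda_via_binomial(f1, f2):
    """Same estimate obtained from the binomial view.

    Among individuals with exactly 1 or 2 contacts, the probability of
    having 2 contacts is p = lambda / (2 + lambda). The binomial MLE is
    p_hat = f2 / (f1 + f2), which is inverted to recover lambda.
    """
    p_hat = f2 / (f1 + f2)
    return 2.0 * p_hat / (1.0 - p_hat)


def zelterman_size(n, f1, f2):
    """Population size estimate: n divided by the estimated P(count > 0)."""
    lam = zelterman_lambda(f1, f2)
    return n / (1.0 - math.exp(-lam))


def chao_size(n, f1, f2):
    """Chao's lower-bound estimate of the population size."""
    return n + f1 ** 2 / (2.0 * f2)
```

With hypothetical frequencies n = 200, f1 = 100, f2 = 50, both routes give a rate of 1.0, so `zelterman_size(200, 100, 50)` is 200 / (1 − e⁻¹), roughly 316.4, while `chao_size(200, 100, 50)` gives 300. The binomial form is what makes a logistic regression on covariates a natural extension, since p can then be modelled per individual.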
Version Information
The current Supplemental Content downloadable file supersedes the original version posted on 22 June 2009. The data file "Supplement C.txt" now appears in its entirety.
Citation
Dankmar Böhning, Peter G. M. van der Heijden. "A covariate adjustment for zero-truncated approaches to estimating the size of hidden and elusive populations." Ann. Appl. Stat. 3 (2) 595–610, June 2009. https://doi.org/10.1214/08-AOAS214