The Annals of Applied Statistics

The identity of the zero-truncated, one-inflated likelihood and the zero-one-truncated likelihood for general count densities with an application to drink-driving in Britain

Dankmar Böhning and Peter G. M. van der Heijden

Abstract

For zero-truncated count data, as they typically arise in capture-recapture modelling, we consider modelling under one-inflation. This is motivated by police data on drink-driving in Britain, which show high one-inflation. The data used here are from the years 2011 to 2015 and are based on DR10 endorsements. We show that inference for an arbitrary count density with one-inflation can be equivalently based upon the associated zero-one-truncated count density. This simplifies inference considerably, including maximum likelihood estimation and likelihood ratio testing. For the drink-driving application, we use the geometric distribution, which shows a good fit. We estimate the total number of drink-drivers in the observational period as about $2{,}300{,}000$. As $227{,}578$ were observed, this means that only about 10% of the drink-driving population is observed, with a bootstrap confidence interval of 9%–12%.
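
The likelihood identity stated in the abstract can be checked numerically for the geometric case. The following is a minimal sketch, not the authors' code. It assumes the parameterization $p(x) = (1-\theta)\theta^x$ for $x = 0, 1, 2, \ldots$ and a one-inflated zero-truncated density that puts extra mass $1-w$ at one. The frequencies are hypothetical and are not the DR10 data, and all function names are illustrative.

```python
import math

# Hypothetical count frequencies (count -> number of units); NOT the DR10 data.
freq = {1: 900, 2: 60, 3: 25, 4: 10, 5: 5}


def loglik_one_inflated(theta, w, freq):
    """Log-likelihood of a zero-truncated geometric with extra mass (1 - w) at one."""
    ll = 0.0
    for x, f in freq.items():
        p_plus = (1 - theta) * theta ** (x - 1)  # zero-truncated geometric
        prob = w * p_plus + ((1 - w) if x == 1 else 0.0)
        ll += f * math.log(prob)
    return ll


def loglik_zero_one_truncated(theta, freq):
    """Log-likelihood of the zero-one-truncated geometric; counts of one are ignored."""
    ll = 0.0
    for x, f in freq.items():
        if x >= 2:
            ll += f * math.log((1 - theta) * theta ** (x - 2))
    return ll


def fit_zero_one_truncated(freq):
    """Closed-form MLE of theta: x - 2 is again geometric on 0, 1, 2, ..."""
    n2 = sum(f for x, f in freq.items() if x >= 2)
    s = sum((x - 2) * f for x, f in freq.items() if x >= 2)
    return s / (s + n2)


n = sum(freq.values())
f1 = freq[1]

theta_hat = fit_zero_one_truncated(freq)
# The fitted probability of a one equals the observed share f1 / n,
# which determines the mixing weight w for the geometric case.
w_hat = (1 - f1 / n) / theta_hat

# The one-inflated log-likelihood splits into a binomial part for
# "count is one" versus "count is two or more" plus the zero-one-truncated part.
binom_part = f1 * math.log(f1 / n) + (n - f1) * math.log(1 - f1 / n)
lhs = loglik_one_inflated(theta_hat, w_hat, freq)
rhs = binom_part + loglik_zero_one_truncated(theta_hat, freq)

print(f"theta_hat = {theta_hat:.4f}, w_hat = {w_hat:.4f}")
print(f"one-inflated log-likelihood at the fit: {lhs:.6f}")
print(f"binomial part + zero-one-truncated:     {rhs:.6f}")
```

The two printed log-likelihood values agree up to floating-point rounding, so under these assumptions maximizing the one-inflated likelihood over $(\theta, w)$ gives the same $\hat\theta$ as the zero-one-truncated fit. The sketch relies on $\hat w \le 1$, which holds when the data are actually one-inflated.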

Article information

Source
Ann. Appl. Stat., Volume 13, Number 2 (2019), 1198-1211.

Dates
Received: May 2018
Revised: November 2018
First available in Project Euclid: 17 June 2019

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1560758443

Digital Object Identifier
doi:10.1214/18-AOAS1232

Mathematical Reviews number (MathSciNet)
MR3963568

Zentralblatt MATH identifier
07094851

Keywords
Capture–recapture; Chao estimator; behavioral response; power series distribution; mixture model; zero-truncated model; nonparametric estimator of population size

Citation

Böhning, Dankmar; van der Heijden, Peter G. M. The identity of the zero-truncated, one-inflated likelihood and the zero-one-truncated likelihood for general count densities with an application to drink-driving in Britain. Ann. Appl. Stat. 13 (2019), no. 2, 1198--1211. doi:10.1214/18-AOAS1232. https://projecteuclid.org/euclid.aoas/1560758443

