## The Annals of Applied Statistics

### Direct likelihood-based inference for discretely observed stochastic compartmental models of infectious disease

#### Abstract

Stochastic compartmental models are important tools for understanding the course of infectious disease epidemics in populations and for the prospective evaluation of intervention policies. However, calculating the likelihood for discretely observed data from even simple models, such as the ubiquitous susceptible-infectious-removed (SIR) model, has been considered computationally intractable since the model's formulation almost a century ago. Recently, researchers have proposed methods to circumvent this limitation through data augmentation or approximation, but these approaches often suffer from high computational cost or loss of accuracy. We develop the mathematical foundation and an efficient algorithm to compute the likelihood for discretely observed data from a broad class of stochastic compartmental models. We also give expressions for the derivatives of the transition probabilities using the same technique, making inference via Hamiltonian Monte Carlo (HMC) possible. We use the 17th-century plague in Eyam, a classic example of the SIR model, to compare our recursion method with sequential Monte Carlo, to analyze the data using HMC, and to assess the model assumptions. We also apply our direct likelihood evaluation to perform Bayesian inference for the 2014–2015 Ebola outbreak in Guinea. The results suggest that the epidemic infection rates have decreased since October 2014 in the Southeast region of Guinea, while rates remain the same in other regions, which facilitates understanding of the outbreak and of the effectiveness of Ebola control interventions.

#### Article information

Source
Ann. Appl. Stat., Volume 12, Number 3 (2018), 1993–2021.

Dates
Revised: November 2017
First available in Project Euclid: 11 September 2018

https://projecteuclid.org/euclid.aoas/1536652983

Digital Object Identifier
doi:10.1214/18-AOAS1141

Mathematical Reviews number (MathSciNet)
MR3852706

#### Citation

Ho, Lam Si Tung; Crawford, Forrest W.; Suchard, Marc A. Direct likelihood-based inference for discretely observed stochastic compartmental models of infectious disease. Ann. Appl. Stat. 12 (2018), no. 3, 1993–2021. doi:10.1214/18-AOAS1141. https://projecteuclid.org/euclid.aoas/1536652983
