The Annals of Applied Statistics

Spatial risk mapping for rare disease with hidden Markov fields and variational EM

Florence Forbes, Myriam Charras-Garrido, Lamiae Azizi, Senan Doyle, and David Abrial

Full-text: Open access

Abstract

Current risk mapping models for pooled data focus on the estimated risk for each geographical unit. A risk classification, that is, grouping of geographical units with similar risk, is then necessary to easily draw interpretable maps, with clearly delimited zones in which protection measures can be applied. As an illustration, we focus on the Bovine Spongiform Encephalopathy (BSE) disease that threatened the bovine production in Europe and generated drastic cow culling. This example features typical animal disease risk analysis issues with very low risk values, small numbers of observed cases and population sizes that increase the difficulty of an automatic classification. We propose to handle this task in a spatial clustering framework using a nonstandard discrete hidden Markov model prior designed to favor a smooth risk variation. The model parameters are estimated using an EM algorithm and a mean field approximation for which we develop a new initialization strategy appropriate for spatial Poisson mixtures. Using both simulated and our BSE data, we show that our strategy performs well in dealing with low population sizes and accurately determines high risk regions, both in terms of localization and risk level estimation.
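
As a rough illustration of the kind of procedure the abstract describes, the sketch below runs a mean-field EM for a spatial Poisson mixture. It is not the authors' method: it assumes a standard Potts prior with a fixed interaction parameter `beta`, a naive initialization spread around the pooled rate, and synchronous mean-field updates, whereas the paper uses a nonstandard prior and its own initialization strategy. The function name `mean_field_em`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np

def mean_field_em(y, n, adjacency, k=2, beta=1.0, n_iter=50):
    """Toy mean-field EM: y_i ~ Poisson(n_i * lam[z_i]), z a Potts field.

    Assumptions (not from the paper): fixed beta, standard Potts prior,
    synchronous updates, and a naive initialization.
    """
    y = np.asarray(y, dtype=float)
    n = np.asarray(n, dtype=float)
    A = np.asarray(adjacency, dtype=float)

    # Naive start: risk levels spread around the pooled rate.
    pooled = y.sum() / n.sum()
    lam = pooled * np.linspace(0.5, 2.0, k)
    q = np.full((len(y), k), 1.0 / k)  # class probabilities per unit

    for _ in range(n_iter):
        # E-step: Poisson log-likelihood plus the mean-field neighbor term.
        mu = n[:, None] * lam[None, :]
        log_q = y[:, None] * np.log(mu) - mu + beta * (A @ q)
        log_q -= log_q.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)

        # M-step: weighted cases over weighted population for each class.
        num = (q * y[:, None]).sum(axis=0)
        den = (q * n[:, None]).sum(axis=0) + 1e-12
        lam = np.maximum(num / den, 1e-12)  # keep log() finite

    return lam, q

# Toy example: a chain of 20 units, low risk on the left, high on the right.
n_units = 20
pop = np.full(n_units, 1000.0)
cases = np.array([1.0] * 10 + [10.0] * 10)
adj = np.zeros((n_units, n_units))
for i in range(n_units - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

risks, probs = mean_field_em(cases, pop, adj, k=2, beta=1.0)
labels = probs.argmax(axis=1)  # risk class assigned to each unit
```

Taking the most probable class per unit, as in `labels`, gives the kind of delimited risk zones the abstract refers to; in practice the interaction parameter would be estimated rather than fixed.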

Article information

Source
Ann. Appl. Stat., Volume 7, Number 2 (2013), 1192-1216.

Dates
First available in Project Euclid: 27 June 2013

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1372338484

Digital Object Identifier
doi:10.1214/13-AOAS629

Mathematical Reviews number (MathSciNet)
MR3113506

Zentralblatt MATH identifier
1288.62158

Keywords
Classification; discrete hidden Markov random field; disease mapping; Poisson mixtures; Potts model; variational EM

Citation

Forbes, Florence; Charras-Garrido, Myriam; Azizi, Lamiae; Doyle, Senan; Abrial, David. Spatial risk mapping for rare disease with hidden Markov fields and variational EM. Ann. Appl. Stat. 7 (2013), no. 2, 1192–1216. doi:10.1214/13-AOAS629. https://projecteuclid.org/euclid.aoas/1372338484


Supplemental materials

  • Supplementary material: Supplement to “Spatial risk mapping for rare disease with hidden Markov fields and variational EM”. Missing appendices, tables and figures are available in a companion supplemental file.