Bayesian Analysis

A Bayesian Nonparametric Spiked Process Prior for Dynamic Model Selection

Alberto Cassese, Weixuan Zhu, Michele Guindani, and Marina Vannucci

Full-text: Open access

Abstract

In many applications, investigators monitor processes that vary in space and time, with the goal of identifying temporally persistent and spatially localized departures from a baseline or “normal” behavior. In this manuscript, we consider the monitoring of pneumonia and influenza (P&I) mortality to detect influenza outbreaks in the continental United States, and propose a Bayesian nonparametric model selection approach that takes into account the spatio-temporal dependence of outbreaks. More specifically, we introduce a zero-inflated conditionally identically distributed species sampling prior, which allows borrowing of information across time and assigns data to clusters associated with either a null or an alternate process. Spatial dependence is accounted for by means of a Markov random field prior, which allows the selection to be informed by inferences conducted at nearby locations. We show how the proposed modeling framework performs in an application to the P&I mortality data and in a simulation study, and compare it with common threshold methods for detecting outbreaks over time, with more recent Markov switching based models, and with spike-and-slab Bayesian nonparametric priors that do not take into account spatio-temporal dependence.
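The abstract does not state the form of the Markov random field prior. As a rough illustration only, the sketch below assumes a common Ising-type parameterization for binary selection indicators, in which the log-odds that a location is flagged as departing from baseline grow with the number of flagged neighbors. The function name `mrf_inclusion_prob` and the hyperparameters `d` and `e` are hypothetical and are not taken from the paper.

```python
import math


def mrf_inclusion_prob(neighbor_flags, d=-2.0, e=0.5):
    """Conditional prior probability that a location is flagged (gamma_i = 1).

    Assumes an Ising-type Markov random field prior (an assumption, not
    necessarily the paper's specification):
        logit P(gamma_i = 1 | neighbors) = d + e * sum_j gamma_j
    where the sum runs over the neighbors of location i.
    d (sparsity) and e (smoothing strength) are illustrative values.
    """
    eta = d + e * sum(neighbor_flags)
    return 1.0 / (1.0 + math.exp(-eta))


# A location whose four neighbors are all at baseline...
p_isolated = mrf_inclusion_prob([0, 0, 0, 0])
# ...versus one whose four neighbors are all flagged as outbreaks.
p_surrounded = mrf_inclusion_prob([1, 1, 1, 1])
```

Under these assumed values, the prior probability of flagging a location is higher when its neighbors are flagged, which is the sense in which inferences at nearby locations inform the selection.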

Article information

Source
Bayesian Anal., Volume 14, Number 2 (2019), 553-572.

Dates
First available in Project Euclid: 10 August 2018

Permanent link to this document
https://projecteuclid.org/euclid.ba/1533866669

Digital Object Identifier
doi:10.1214/18-BA1116

Mathematical Reviews number (MathSciNet)
MR3934097

Zentralblatt MATH identifier
07045442

Keywords
nonparametric Bayes; variable selection; Markov random field; spatio-temporal data

Rights
Creative Commons Attribution 4.0 International License.

Citation

Cassese, Alberto; Zhu, Weixuan; Guindani, Michele; Vannucci, Marina. A Bayesian Nonparametric Spiked Process Prior for Dynamic Model Selection. Bayesian Anal. 14 (2019), no. 2, 553–572. doi:10.1214/18-BA1116. https://projecteuclid.org/euclid.ba/1533866669



