Open Access
March 2018
Phenomenological forecasting of disease incidence using heteroskedastic Gaussian processes: A dengue case study
Leah R. Johnson, Robert B. Gramacy, Jeremy Cohen, Erin Mordecai, Courtney Murdock, Jason Rohr, Sadie J. Ryan, Anna M. Stewart-Ibarra, Daniel Weikel
Ann. Appl. Stat. 12(1): 27-66 (March 2018). DOI: 10.1214/17-AOAS1090


In 2015 the US federal government sponsored a dengue forecasting competition using historical case data from Iquitos, Peru and San Juan, Puerto Rico. Competitors were evaluated on several aspects of out-of-sample forecasts, including the targets of peak week, peak incidence during that week, and total season incidence across each of several seasons. Our team was one of the winners of that competition, outperforming other teams in multiple targets/locales. In this paper we report on our methodology, a large component of which, surprisingly, ignores the known biology of epidemics at large (for example, relationships between dengue transmission and environmental factors) and instead relies on flexible nonparametric, nonlinear Gaussian process (GP) regression fits that “memorize” the trajectories of past seasons and then “match” the dynamics of the unfolding season to past ones in real time. Our phenomenological approach has advantages in situations where disease dynamics are less well understood, where measurements and forecasts of ancillary covariates like precipitation are unavailable, or where the strength of association between those covariates and cases is as yet unknown. In particular, we show that the GP approach generally outperforms a more classical generalized linear (autoregressive) model (GLM) that we developed to utilize abundant covariate information. We illustrate variations of our method(s) on the two benchmark locales alongside a full summary of results submitted by other contest competitors.



Leah R. Johnson, Robert B. Gramacy, Jeremy Cohen, Erin Mordecai, Courtney Murdock, Jason Rohr, Sadie J. Ryan, Anna M. Stewart-Ibarra, Daniel Weikel. "Phenomenological forecasting of disease incidence using heteroskedastic Gaussian processes: A dengue case study." Ann. Appl. Stat. 12(1): 27-66, March 2018.


Received: 1 May 2017; Revised: 1 August 2017; Published: March 2018
First available in Project Euclid: 9 March 2018

zbMATH: 06894698
MathSciNet: MR3773385
Digital Object Identifier: 10.1214/17-AOAS1090

Keywords: dengue fever, epidemiology, Gaussian process, generalized linear (autoregressive) model, heteroskedastic modeling, latent variable

Rights: Copyright © 2018 Institute of Mathematical Statistics

