The Annals of Applied Statistics

Beta regression for time series analysis of bounded data, with application to Canada Google® Flu Trends

Annamaria Guolo and Cristiano Varin

Full-text: Open access

Abstract

Bounded time series consisting of rates or proportions are often encountered in applications. This manuscript proposes a practical approach to analyzing bounded time series through a beta regression model. The method allows direct interpretation of the regression parameters on the original response scale while properly accounting for the heteroskedasticity typical of bounded variables. Serial dependence is modeled by a Gaussian copula whose correlation matrix corresponds to a stationary autoregressive moving average process. It is shown that inference, prediction, and control can be carried out straightforwardly, with minor modifications to the standard analysis of autoregressive moving average models. The methodology is motivated by an application to the influenza-like-illness incidence estimated by the Google® Flu Trends project.
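
As an illustration of the construction the abstract describes, the following minimal sketch simulates a bounded series with beta marginals whose serial dependence comes from a Gaussian copula. It is not the authors' implementation. The function name `simulate_beta_copula_ar1`, the logit link, the mean-precision parameterization Beta(mu*kappa, (1-mu)*kappa), and the restriction to an AR(1) latent process (the simplest stationary ARMA case) are all assumptions made for this example.

```python
import numpy as np
from scipy import stats
from scipy.special import expit


def simulate_beta_copula_ar1(X, coef, kappa, rho, rng):
    """Simulate y_t in (0, 1) with beta marginals and Gaussian copula AR(1) dependence.

    Hypothetical helper for illustration only.
    X     : (n, p) design matrix
    coef  : (p,) regression coefficients on the logit scale (assumed link)
    kappa : precision parameter, so Var(y_t) = mu_t * (1 - mu_t) / (1 + kappa)
    rho   : AR(1) coefficient of the latent Gaussian process, |rho| < 1
    rng   : numpy.random.Generator
    """
    X = np.asarray(X, dtype=float)
    mu = expit(X @ np.asarray(coef, dtype=float))  # marginal means in (0, 1)
    n = mu.size

    # Latent stationary AR(1) process with unit marginal variance.
    z = np.empty(n)
    z[0] = rng.standard_normal()
    innov_sd = np.sqrt(1.0 - rho**2)
    for t in range(1, n):
        z[t] = rho * z[t - 1] + innov_sd * rng.standard_normal()

    # Gaussian copula step: map latent normals to uniforms, then apply
    # the beta quantile function so each y_t has the target beta marginal.
    u = stats.norm.cdf(z)
    y = stats.beta.ppf(u, mu * kappa, (1.0 - mu) * kappa)
    return y, mu
```

For example, with an intercept-only design and `coef = [np.log(0.3 / 0.7)]`, each simulated value has marginal mean 0.3, while `rho` controls how strongly consecutive values move together. Because the regression enters only through the marginal mean, the coefficients keep their interpretation on the original response scale regardless of the dependence parameter.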

Article information

Source
Ann. Appl. Stat. Volume 8, Number 1 (2014), 74-88.

Dates
First available in Project Euclid: 8 April 2014

Permanent link to this document
http://projecteuclid.org/euclid.aoas/1396966279

Digital Object Identifier
doi:10.1214/13-AOAS684

Mathematical Reviews number (MathSciNet)
MR3191983

Zentralblatt MATH identifier
06302228

Keywords
Beta regression; bounded time series; Gaussian copula; Google® Flu Trends; surveillance

Citation

Guolo, Annamaria; Varin, Cristiano. Beta regression for time series analysis of bounded data, with application to Canada Google® Flu Trends. Ann. Appl. Stat. 8 (2014), no. 1, 74–88. doi:10.1214/13-AOAS684. http://projecteuclid.org/euclid.aoas/1396966279.

