The Annals of Applied Statistics

Smoothed ANOVA with spatial effects as a competitor to MCAR in multivariate spatial smoothing

Yufen Zhang, James S. Hodges, and Sudipto Banerjee

Full-text: Open access

Abstract

Rapid developments in geographical information systems (GIS) continue to generate interest in analyzing complex spatial datasets. One area of activity is in creating smoothed disease maps to describe the geographic variation of disease and generate hypotheses for apparent differences in risk. With multiple diseases, a multivariate conditionally autoregressive (MCAR) model is often used to smooth across space while accounting for associations between the diseases. The MCAR, however, imposes complex covariance structures that are difficult to interpret and estimate. This article develops a much simpler alternative approach building upon the techniques of smoothed ANOVA (SANOVA). Instead of simply shrinking effects without any structure, here we use SANOVA to smooth spatial random effects by taking advantage of the spatial structure. We extend SANOVA to cases in which one factor is a spatial lattice, which is smoothed using a CAR model, and a second factor is, for example, type of cancer. Datasets routinely lack enough information to identify the additional structure of MCAR. SANOVA offers a simpler and more intelligible structure than the MCAR while performing as well. We demonstrate our approach with simulation studies designed to compare SANOVA with different design matrices versus MCAR with different priors. Subsequently, a cancer-surveillance dataset, describing incidence of three cancers in Minnesota’s 87 counties, is analyzed using both approaches, showing the competitiveness of the SANOVA approach.
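The structure described in the abstract (a spatial lattice smoothed by a CAR model, crossed with a second factor such as cancer type) can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration and is not taken from the article: the 2 x 3 toy grid, the Helmert-style contrast matrix, the precision values in `tau`, and the reading that each contrast column receives its own independent intrinsic CAR prior, which gives a joint precision of Kronecker form.

```python
import numpy as np

def grid_adjacency(nrow, ncol):
    """0/1 rook-neighbour adjacency for a toy nrow x ncol lattice (hypothetical)."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            if c + 1 < ncol:                      # neighbour to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < nrow:                      # neighbour below
                W[i, i + ncol] = W[i + ncol, i] = 1.0
    return W

def car_structure(W):
    """Intrinsic CAR structure matrix Q = D - W, with D = diag(neighbour counts)."""
    return np.diag(W.sum(axis=1)) - W

def helmert_contrasts(p):
    """Orthonormal p x p matrix: column 0 is the grand-mean direction,
    the remaining columns are Helmert-style contrasts (an assumed choice)."""
    H = np.zeros((p, p))
    H[:, 0] = 1.0 / np.sqrt(p)
    for j in range(1, p):
        H[:j, j] = 1.0
        H[j, j] = -float(j)
        H[:, j] /= np.sqrt(j * (j + 1))
    return H

def joint_precision(Q, H, tau):
    """Precision of vec(theta) (column-stacked, n areas x p diseases) when each
    column of theta @ H has an independent CAR prior with precision tau[j] * Q."""
    return np.kron(H @ np.diag(tau) @ H.T, Q)

W = grid_adjacency(2, 3)          # 6 areas
Q = car_structure(W)
H = helmert_contrasts(3)          # 3 diseases
tau = np.array([0.5, 2.0, 4.0])   # assumed smoothing precisions, one per contrast
P = joint_precision(Q, H, tau)    # 18 x 18 joint precision matrix
```

Under these assumptions, the between-disease part of the precision is fixed by `H` up to the values in `tau`, so only one smoothing parameter per contrast is needed rather than a full between-disease covariance matrix.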

Article information

Source
Ann. Appl. Stat., Volume 3, Number 4 (2009), 1805-1830.

Dates
First available in Project Euclid: 1 March 2010

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1267453965

Digital Object Identifier
doi:10.1214/09-AOAS267

Mathematical Reviews number (MathSciNet)
MR2752159

Zentralblatt MATH identifier
1184.62126

Keywords
Analysis of variance; Bayesian inference; conditionally autoregressive model; hierarchical model; smoothing

Citation

Zhang, Yufen; Hodges, James S.; Banerjee, Sudipto. Smoothed ANOVA with spatial effects as a competitor to MCAR in multivariate spatial smoothing. Ann. Appl. Stat. 3 (2009), no. 4, 1805–1830. doi:10.1214/09-AOAS267. https://projecteuclid.org/euclid.aoas/1267453965


