Bayesian Analysis

Bayesian multivariate areal wombling for multiple disease boundary analysis

Bradley P. Carlin and Haijun Ma

Full-text: Open access


Multivariate data summarized over areal units (counties, zip codes, etc.) are common in the field of public health. Estimation or testing of geographic boundaries for such data may have varied goals. For example, for data on multiple disease outcomes, we may be interested in a single set of "composite" boundaries for all diseases, separate boundaries for each disease, or both. Different areal wombling (boundary analysis) techniques are needed to meet these different requirements. But in any case, the underlying statistical model needs to account for correlations across both diseases and locations. Utilizing recent developments in multivariate conditionally autoregressive (MCAR) distributions and spatial structural equation modeling, we suggest a variety of Bayesian hierarchical models for multivariate areal boundary analysis, including some that incorporate random neighborhood structure. Many of our models can be implemented via standard software, namely WinBUGS for posterior sampling and R for summarization and plotting. We illustrate our methods using Minnesota county-level esophagus, larynx, and lung cancer data, comparing models that account for both, only one, or neither of the aforementioned correlations. We identify both composite and cancer-specific boundaries, selecting the best statistical model using the DIC criterion. Our results indicate primary boundaries in both the composite and cancer-specific response surface separating the mining- and tourism-oriented northeast counties from the remainder of the state, as well as secondary (residual) boundaries in the Twin Cities metro area.
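
As a rough illustration of the post-sampling summarization step mentioned in the abstract, the sketch below computes a boundary likelihood value for each pair of adjacent regions from posterior draws of one disease's regional means. It assumes the absolute-difference definition |mu_i - mu_j| that is common in areal wombling; the abstract does not state the exact definition used in the paper, and the function name, the threshold, the region labels, and the toy draws are all hypothetical.

```python
from statistics import mean


def boundary_likelihood_values(draws, edges, threshold):
    """Summarize |mu_i - mu_j| over posterior draws for each adjacent pair.

    draws:     list of dicts, one per MCMC draw, mapping region -> sampled mean
    edges:     list of (region_i, region_j) pairs that share a border
    threshold: hypothetical cutoff c used to flag an edge as a boundary
    Returns {edge: (posterior mean of BLV, posterior P(BLV > c))}.
    """
    summary = {}
    for i, j in edges:
        # One boundary likelihood value per posterior draw (assumed definition).
        blv = [abs(d[i] - d[j]) for d in draws]
        exceed = [1.0 if v > threshold else 0.0 for v in blv]
        summary[(i, j)] = (mean(blv), mean(exceed))
    return summary


# Made-up draws for three hypothetical regions; not data from the paper.
draws = [
    {"A": 1.0, "B": 1.1, "C": 2.0},
    {"A": 0.9, "B": 1.0, "C": 2.2},
    {"A": 1.1, "B": 1.0, "C": 1.9},
    {"A": 1.0, "B": 0.9, "C": 2.1},
]
edges = [("A", "B"), ("B", "C")]
result = boundary_likelihood_values(draws, edges, threshold=0.5)
```

In this toy input the edge between B and C would be flagged as a likely boundary and the edge between A and B would not. A composite boundary across several diseases would need a rule for combining the disease-specific values, which this sketch does not attempt.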

Article information

Bayesian Anal., Volume 2, Number 2 (2007), 281-302.

First available in Project Euclid: 22 June 2012

Keywords: Areal data; Cancer; Multivariate conditionally autoregressive (MCAR) model; Surveillance, Epidemiology and End Results (SEER) data


Ma, Haijun; Carlin, Bradley P. Bayesian multivariate areal wombling for multiple disease boundary analysis. Bayesian Anal. 2 (2007), no. 2, 281--302. doi:10.1214/07-BA211.
