Electronic Journal of Statistics

Spatial models for point and areal data using Markov random fields on a fine grid

Christopher J. Paciorek

Full-text: Open access

Abstract

I consider the use of Markov random fields (MRFs) on a fine grid to represent latent spatial processes when modeling point-level and areal data, including situations with spatial misalignment. Point observations are related to the grid cell in which they reside, while areal observations are related to the (approximate) integral over the latent process within the area of interest. I review several approaches to specifying the neighborhood structure for constructing the MRF precision matrix, presenting results comparing these MRF representations analytically, in simulations, and in two examples. The results provide practical guidance for choosing a spatial process representation and highlight the importance of this choice. In particular, the results demonstrate that, and explain why, standard CAR models can behave strangely for point-level data. They show that various neighborhood weighting approaches based on higher-order neighbors that have been suggested for MRF models do not produce smooth fields, which raises doubts about their utility. Finally, they indicate that an MRF that approximates a thin plate spline compares favorably to standard CAR models and to kriging under many circumstances.
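As a rough illustration of the setup the abstract describes, the sketch below builds a precision matrix for a first-order intrinsic CAR model on a small regular grid and constructs rows that link observations to grid cells. It is a minimal sketch and not the paper's implementation. The function names (`icar_precision`, `point_row`, `area_row`), the unit-square domain, the rook (four nearest neighbor) structure, and the use of an unweighted cell average as the approximate areal integral are all assumptions made for this example.

```python
import numpy as np


def icar_precision(nrow, ncol):
    """Precision matrix of a first-order intrinsic CAR model on an
    nrow x ncol grid, using rook neighbors (assumed for illustration).
    Off-diagonals are -1 for neighbors; diagonals count the neighbors."""
    n = nrow * ncol
    Q = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            # Visit the right and lower neighbors so each pair is handled once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < nrow and cc < ncol:
                    j = rr * ncol + cc
                    Q[i, j] -= 1.0
                    Q[j, i] -= 1.0
                    Q[i, i] += 1.0
                    Q[j, j] += 1.0
    return Q


def point_row(x, y, nrow, ncol):
    """Indicator row relating a point in the unit square (hypothetical
    domain) to the grid cell in which it resides."""
    c = min(int(x * ncol), ncol - 1)
    r = min(int(y * nrow), nrow - 1)
    row = np.zeros(nrow * ncol)
    row[r * ncol + c] = 1.0
    return row


def area_row(cells, nrow, ncol):
    """Averaging row for an areal observation: approximates the integral
    of the latent process by the unweighted mean of the listed cells."""
    cells = list(cells)
    row = np.zeros(nrow * ncol)
    row[cells] = 1.0 / len(cells)
    return row
```

Stacking such rows gives a mapping matrix from the gridded latent process to the observations, so point and areal data can share one latent field. The rows of this precision matrix sum to zero, so the matrix is singular, which is why the prior is called intrinsic.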

Article information

Source
Electron. J. Statist. Volume 7 (2013), 946-972.

Dates
Received: May 2012
First available in Project Euclid: 15 April 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1366031046

Digital Object Identifier
doi:10.1214/13-EJS791

Mathematical Reviews number (MathSciNet)
MR3044505

Zentralblatt MATH identifier
1337.62302

Keywords
Conditional autoregressive models; Gaussian processes; spatial smoothing; thin plate splines

Citation

Paciorek, Christopher J. Spatial models for point and areal data using Markov random fields on a fine grid. Electron. J. Statist. 7 (2013), 946–972. doi:10.1214/13-EJS791. https://projecteuclid.org/euclid.ejs/1366031046

