The Annals of Statistics

Goodness of fit tests for a class of Markov random field models

Mark S. Kaiser, Soumendra N. Lahiri, and Daniel J. Nordman

Abstract

This paper develops goodness of fit statistics that can be used to formally assess Markov random field models for spatial data, when the model distributions are discrete or continuous and potentially parametric. Test statistics are formed from generalized spatial residuals, which are collected over groups of nonneighboring spatial observations called concliques. Under a hypothesized Markov model structure, the spatial residuals within each conclique are shown to be independent and identically distributed uniform variables. The information from a series of concliques can then be pooled into goodness of fit statistics. Under some conditions, the large-sample distributions of these statistics are explicitly derived for testing both simple and composite hypotheses, where the latter involve additional parametric estimation steps. The distributional results are verified through simulation, and a data example illustrates the method for model assessment.
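The construction described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a regular lattice with a four-nearest-neighbor structure (so the two checkerboard colorings serve as concliques), a Gaussian conditional model with hypothetical parameters `mu`, `eta`, and `tau`, a Kolmogorov-Smirnov distance within each conclique, and pooling by the maximum over concliques. The paper's actual statistics and their limiting distributions are not reproduced, so no critical values are given.

```python
import math
import numpy as np


def checkerboard_concliques(n_rows, n_cols):
    """Two concliques for a four-nearest-neighbor lattice (assumed structure)."""
    rows, cols = np.indices((n_rows, n_cols))
    parity = (rows + cols) % 2
    return [parity == 0, parity == 1]


def neighbor_sum(y):
    """Sum over the four nearest neighbors of each site; edge sites have fewer."""
    s = np.zeros_like(y, dtype=float)
    s[1:, :] += y[:-1, :]   # neighbor above
    s[:-1, :] += y[1:, :]   # neighbor below
    s[:, 1:] += y[:, :-1]   # neighbor to the left
    s[:, :-1] += y[:, 1:]   # neighbor to the right
    return s


def generalized_residuals(y, mu, eta, tau):
    """Probability integral transform through an assumed Gaussian conditional CDF.

    Assumed model: Y_i | neighbors ~ N(mu + eta * sum_j (y_j - mu), tau^2).
    """
    n_neighbors = neighbor_sum(np.ones(y.shape))
    cond_mean = mu + eta * (neighbor_sum(y) - mu * n_neighbors)
    z = (y - cond_mean) / tau
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))


def ks_uniform(u):
    """Kolmogorov-Smirnov distance between the empirical CDF of u and Uniform(0, 1)."""
    u = np.sort(np.ravel(u))
    n = u.size
    upper = np.arange(1, n + 1) / n - u
    lower = u - np.arange(0, n) / n
    return float(max(upper.max(), lower.max()))


def pooled_statistic(u, concliques):
    """Scaled KS distance per conclique, pooled by the maximum (one simple choice)."""
    return max(math.sqrt(mask.sum()) * ks_uniform(u[mask]) for mask in concliques)


# Demo under the independence special case eta = 0, where simulation is trivial.
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.5, size=(20, 20))
u = generalized_residuals(y, mu=2.0, eta=0.0, tau=1.5)
print(pooled_statistic(u, checkerboard_concliques(20, 20)))
```

With `eta = 0` the sites are independent, so the residuals are uniform by construction; for a dependent model the data would have to be simulated from the joint distribution (for example by Gibbs sampling), which this sketch omits.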

Article information

Source
Ann. Statist. Volume 40, Number 1 (2012), 104-130.

Dates
First available: 15 March 2012

Permanent link to this document
http://projecteuclid.org/euclid.aos/1331830776

Digital Object Identifier
doi:10.1214/11-AOS948

Zentralblatt MATH identifier
06075609

Mathematical Reviews number (MathSciNet)
MR3013181

Subjects
Primary: 62F03: Hypothesis testing
Secondary: 62M30: Spatial processes

Keywords
Increasing domain asymptotics; probability integral transform; spatial processes; spatial residuals

Citation

Kaiser, Mark S.; Lahiri, Soumendra N.; Nordman, Daniel J. Goodness of fit tests for a class of Markov random field models. The Annals of Statistics 40 (2012), no. 1, 104–130. doi:10.1214/11-AOS948. http://projecteuclid.org/euclid.aos/1331830776.


References

  • [1] Anderson, T. W. (1993). Goodness of fit tests for spectral distributions. Ann. Statist. 21 830–847.
  • [2] Arnold, B. C., Castillo, E. and Sarabia, J. M. (1992). Conditionally Specified Distributions. Lecture Notes in Statistics 73. Springer, Berlin.
  • [3] Bai, J. (2003). Testing parametric conditional distributions of dynamic models. Rev. Econom. Statist. 85 531–549.
  • [4] Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. Roy. Statist. Soc. Ser. B 36 192–236.
  • [5] Besag, J. and Higdon, D. (1999). Bayesian analysis of agricultural field experiments. J. R. Stat. Soc. Ser. B Stat. Methodol. 61 691–746.
  • [6] Besag, J. and Kooperberg, C. (1995). On conditional and intrinsic autoregressions. Biometrika 82 733–746.
  • [7] Brockwell, A. E. (2007). Universal residuals: A multivariate transformation. Statist. Probab. Lett. 77 1473–1478.
  • [8] Caragea, P. C. and Kaiser, M. S. (2009). Autologistic models with interpretable parameters. J. Agric. Biol. Environ. Stat. 14 281–300.
  • [9] Cox, D. R. and Snell, E. J. (1971). On test statistics calculated from residuals. Biometrika 58 589–594.
  • [10] Cressie, N. A. C. (1993). Statistics for Spatial Data, 2nd ed. Wiley, New York.
  • [11] Csiszár, I. and Talata, Z. (2006). Consistent estimation of the basic neighborhood of Markov random fields. Ann. Statist. 34 123–145.
  • [12] Czado, C., Gneiting, T. and Held, L. (2009). Predictive model assessment for count data. Biometrics 65 1254–1261.
  • [13] Darling, D. A. (1957). The Kolmogorov–Smirnov, Cramér–von Mises tests. Ann. Math. Statist. 28 823–838.
  • [14] Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge Series in Statistical and Probabilistic Mathematics 1. Cambridge Univ. Press, Cambridge.
  • [15] Dawid, A. P. (1984). Statistical theory. The prequential approach. J. Roy. Statist. Soc. Ser. A 147 278–292.
  • [16] Diebold, F. X., Gunther, T. A. and Tay, A. S. (1998). Evaluating density forecasts with applications to financial risk management. Internat. Econom. Rev. 39 863–883.
  • [17] Durbin, J. (1973). Weak convergence of the sample distribution function when parameters are estimated. Ann. Statist. 1 279–290.
  • [18] Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statist. Sci. 7 457–511.
  • [19] Gneiting, T., Balabdaoui, F. and Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. J. R. Stat. Soc. Ser. B Stat. Methodol. 69 243–268.
  • [20] Guyon, X. (1995). Random Fields on a Network. Springer, New York.
  • [21] Guyon, X. and Yao, J.-F. (1999). On the underfitting and overfitting sets of models chosen by order selection criteria. J. Multivariate Anal. 70 221–249.
  • [22] Hammersley, J. M. and Clifford, P. (1971). Markov fields on finite graphs and lattices. Unpublished manuscript.
  • [23] Hardouin, C. and Yao, J.-F. (2008). Multi-parameter auto-models with applications to cooperative systems and analysis of mixed state data. Biometrika 95 335–349.
  • [24] Jager, L. and Wellner, J. A. (2007). Goodness-of-fit tests via phi-divergences. Ann. Statist. 35 2018–2053.
  • [25] Jensen, T. R. and Toft, B. (1995). Graph Coloring Problems. Wiley, New York.
  • [26] Ji, C. and Seymour, L. (1996). A consistent model selection procedure for Markov random fields based on penalized pseudolikelihood. Ann. Appl. Probab. 6 423–443.
  • [27] Justel, A., Peña, D. and Zamar, R. (1997). A multivariate Kolmogorov–Smirnov test of goodness of fit. Statist. Probab. Lett. 35 251–259.
  • [28] Kaiser, M. S. and Caragea, P. C. (2009). Exploring dependence with data on spatial lattices. Biometrics 65 857–865.
  • [29] Kaiser, M. S. and Cressie, N. (1997). Modeling Poisson variables with positive spatial dependence. Statist. Probab. Lett. 35 423–432.
  • [30] Kaiser, M. S. and Cressie, N. (2000). The construction of multivariate distributions from Markov random fields. J. Multivariate Anal. 73 199–220.
  • [31] Kaiser, M. S., Cressie, N. and Lee, J. (2002). Spatial mixture models based on exponential family conditional distributions. Statist. Sinica 12 449–474.
  • [32] Kaiser, M. S., Lahiri, S. N. and Nordman, D. J. (2011). Supplement to “Goodness of fit tests for a class of Markov random field models.” DOI:10.1214/11-AOS948SUPP.
  • [33] Khmaladze, È. V. (1981). A martingale approach in the theory of goodness-of-fit tests. Theory Probab. Appl. 26 240–257.
  • [34] Khmaladze, È. V. (1993). Goodness of fit problem and scanning innovation martingales. Ann. Statist. 21 798–829.
  • [35] Khmaladze, E. V. and Koul, H. L. (2004). Martingale transforms goodness-of-fit tests in regression models. Ann. Statist. 32 995–1034.
  • [36] Koul, H. L. (1970). A class of ADF tests for subhypothesis in the multiple linear regression. Ann. Math. Statist. 41 1273–1281.
  • [37] Koul, H. L. and Sakhanenko, L. (2005). Goodness-of-fit testing in regression: A finite sample comparison of bootstrap methodology and Khmaladze transformation. Statist. Probab. Lett. 74 290–302.
  • [38] Lahiri, S. N. (1999). Asymptotic distribution of the empirical spatial cumulative distribution function predictor and prediction bands based on a subsampling method. Probab. Theory Related Fields 114 55–84.
  • [39] Lahiri, S. N. (2003). Central limit theorems for weighted sums of a spatial process under a class of stochastic and fixed designs. Sankhyā Ser. A 65 356–388.
  • [40] Lahiri, S. N., Kaiser, M. S., Cressie, N. and Hsu, N.-J. (1999). Prediction of spatial cumulative distribution functions using subsampling. J. Amer. Statist. Assoc. 94 86–110. With comments and a rejoinder by the authors.
  • [41] Rosenblatt, M. (1952). Remarks on a multivariate transformation. Ann. Math. Statist. 23 470–472.
  • [42] Rue, H. and Held, L. (2005). Gaussian Markov Random Fields: Theory and Applications. Monographs on Statistics and Applied Probability 104. Chapman and Hall/CRC, Boca Raton, FL.
  • [43] Sherman, M. and Carlstein, E. (1994). Nonparametric estimation of the moments of a general statistic computed from spatial data. J. Amer. Statist. Assoc. 89 496–500.
  • [44] Smith, R. L. (1999). Discussion of “Bayesian analysis of agricultural field experiments,” by J. Besag and D. Higdon. J. Roy. Statist. Soc. Ser. B 61 725–727.
  • [45] Speed, T. P. (1978). Relations between models for spatial data, contingency tables and Markov fields on graphs. Suppl. Adv. Appl. Probab. 10 111–122.
  • [46] van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes. Springer, New York.

Supplemental materials

  • Supplementary material: Proofs of main results for spatial GOF test statistics. A supplement [32] provides proofs of all asymptotic distributional results from Section 4, regarding the conclique-based spatial GOF test statistics in simple and composite null hypothesis settings (Proposition 4.1, Theorem 4.2, Corollary 4.3, Theorem 4.4, Corollary 4.5). The proof in the composite hypothesis case is particularly nonstandard; see Section 4.4.