Bayesian Analysis

ABC likelihood-free methods for model choice in Gibbs random fields

Aude Grelaud, Jean-Michel Marin, Christian P. Robert, François Rodolphe, and Jean-François Taly

Full-text: Open access

Abstract

Gibbs random fields (GRF) are polymorphous statistical models that can be used to analyse different types of dependence, in particular for spatially correlated data. However, when a dependence structure must be selected from among many candidates, standard model choice methods are hampered by the unavailability of the normalising constant in the Gibbs likelihood. In particular, from a Bayesian perspective, computing the posterior probabilities of the competing models requires special likelihood-free simulation techniques such as the Approximate Bayesian Computation (ABC) algorithm, which is used intensively in population genetics. We show in this paper how to implement an ABC algorithm geared towards model choice in the general setting of Gibbs random fields, demonstrating in particular that there exists a sufficient statistic across models. The accuracy of the approximation to the posterior probabilities can be further improved by importance sampling on the distribution of the models. The practical aspects of the method are detailed through two applications: the test of an iid Bernoulli model versus a first-order Markov chain, and the choice of a folding structure for two proteins.
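
As a rough illustration of the first application named in the abstract, the sketch below runs a basic ABC rejection sampler that chooses between an iid Bernoulli model and a first-order two-state Markov chain. It is a minimal sketch under stated assumptions, not the authors' implementation: the logistic parameterisation, the uniform priors on the parameters, the uniform prior on the model index, the tolerance `tol`, and all function names are hypothetical choices made for this example. The summary statistic is the pair (number of ones, number of equal neighbours), assumed here to play the role of the statistic that is sufficient across both models.

```python
import math
import random


def simulate_iid(theta, n, rng):
    """Model 0 (assumed form): iid Bernoulli with P(x_i = 1) = logistic(theta)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return [1 if rng.random() < p else 0 for _ in range(n)]


def simulate_markov(theta, n, rng):
    """Model 1 (assumed form): uniform start, P(x_i = x_{i-1}) = logistic(theta)."""
    p_stay = 1.0 / (1.0 + math.exp(-theta))
    x = [rng.randint(0, 1)]
    for _ in range(n - 1):
        x.append(x[-1] if rng.random() < p_stay else 1 - x[-1])
    return x


def summary(x):
    """Return (number of ones, number of equal neighbouring pairs)."""
    s0 = sum(x)
    s1 = sum(1 for a, b in zip(x, x[1:]) if a == b)
    return (s0, s1)


def abc_model_choice(x_obs, n_sims=20000, tol=2.0, seed=0):
    """Estimate posterior model probabilities by ABC rejection.

    Returns [P(model 0 | data), P(model 1 | data)] as acceptance
    frequencies, or None if no simulation was accepted.
    """
    rng = random.Random(seed)
    n = len(x_obs)
    s_obs = summary(x_obs)
    accepted = [0, 0]
    for _ in range(n_sims):
        m = rng.randint(0, 1)  # assumed uniform prior on the model index
        if m == 0:
            theta = rng.uniform(-5.0, 5.0)  # illustrative prior, an assumption
            x = simulate_iid(theta, n, rng)
        else:
            theta = rng.uniform(0.0, 6.0)  # illustrative prior, an assumption
            x = simulate_markov(theta, n, rng)
        s = summary(x)
        # Accept when the simulated statistic is within tol of the observed one.
        if math.hypot(s[0] - s_obs[0], s[1] - s_obs[1]) <= tol:
            accepted[m] += 1
    total = sum(accepted)
    if total == 0:
        return None
    return [a / total for a in accepted]


if __name__ == "__main__":
    # A strongly persistent sequence: 25 zeros followed by 25 ones.
    x_obs = [0] * 25 + [1] * 25
    print(abc_model_choice(x_obs))
```

Shrinking `tol` brings the accepted simulations closer to the observed statistic at the cost of fewer acceptances, so the estimate becomes less biased but noisier. The importance sampling refinement mentioned in the abstract is not covered by this sketch.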

Article information

Source
Bayesian Anal., Volume 4, Number 2 (2009), 317-335.

Dates
First available in Project Euclid: 22 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1340370280

Digital Object Identifier
doi:10.1214/09-BA412

Mathematical Reviews number (MathSciNet)
MR2507366

Zentralblatt MATH identifier
1330.62126

Keywords
Approximate Bayesian Computation; model choice; Gibbs Random Fields; Bayes factor; protein folding

Citation

Grelaud, Aude; Robert, Christian P.; Marin, Jean-Michel; Rodolphe, François; Taly, Jean-François. ABC likelihood-free methods for model choice in Gibbs random fields. Bayesian Anal. 4 (2009), no. 2, 317–335. doi:10.1214/09-BA412. https://projecteuclid.org/euclid.ba/1340370280


