Bayesian Analysis

A New Family of Non-Local Priors for Chain Event Graph Model Selection

Rodrigo A. Collazo and Jim Q. Smith

Full-text: Open access


Chain Event Graphs (CEGs) are a rich class of graphical models of proven usefulness. The class contains discrete Bayesian Networks as a special case and can directly depict the asymmetric, context-specific statements in a model. However, efficient bespoke algorithms now need to be developed to search the enormous CEG model space. In other contexts, Bayes Factor scored search algorithms using non-local priors (NLPs) have recently proved very successful for searching other huge model spaces. Here we define and explore three different types of NLP that we customise to search CEG spaces. We demonstrate how one of these candidate NLPs provides a framework for search that is both robust and computationally efficient. It also avoids selecting an overfitting model, as the standard conjugate methods sometimes do. We illustrate the efficacy of our methods with two examples. First, we analyse a previously well-studied 5-year longitudinal study of childhood hospitalisation. The second, much larger example selects between competing models of prisoners’ radicalisation in British prisons; because of its size, this application is beyond the scope of earlier Bayes Factor search algorithms.
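As a rough illustration of what a Bayes Factor score looks like under a standard conjugate (local) prior, the sketch below computes the log Bayes Factor for merging two CEG situations into a single stage. It is a minimal sketch under stated assumptions, not the authors' implementation: each situation is assumed to carry an independent Dirichlet prior over its outgoing edges, the merged stage is assumed to take the sum of the two hyperparameter vectors, and the function names `log_marginal` and `log_bf_merge` are hypothetical. The non-local priors studied in the paper modify this kind of score and are not reproduced here.

```python
from math import lgamma


def log_marginal(alpha, counts):
    """Log marginal likelihood of edge counts under a Dirichlet(alpha) prior.

    Assumption: observations are treated as an ordered sequence, so no
    multinomial coefficient appears.
    """
    a0, n0 = sum(alpha), sum(counts)
    out = lgamma(a0) - lgamma(a0 + n0)
    for a, n in zip(alpha, counts):
        out += lgamma(a + n) - lgamma(a)
    return out


def log_bf_merge(alpha1, n1, alpha2, n2):
    """Log Bayes Factor of 'same stage' against 'different stages'.

    Assumption: the merged stage uses the element-wise sum of the two
    Dirichlet hyperparameter vectors and of the two count vectors.
    """
    alpha_m = [a + b for a, b in zip(alpha1, alpha2)]
    n_m = [a + b for a, b in zip(n1, n2)]
    merged = log_marginal(alpha_m, n_m)
    separate = log_marginal(alpha1, n1) + log_marginal(alpha2, n2)
    return merged - separate


# Illustrative counts only (not data from the paper).
similar = log_bf_merge([1, 1], [50, 50], [1, 1], [50, 50])
different = log_bf_merge([1, 1], [90, 10], [1, 1], [10, 90])
print(similar > 0, different < 0)
```

A positive value favours placing the two situations in the same stage and a negative value favours keeping them apart; a search algorithm would evaluate such scores across many candidate merges.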

Article information

Bayesian Anal., Volume 11, Number 4 (2016), 1165-1201.

First available in Project Euclid: 30 November 2015

Keywords: chain event graph; Bayesian model selection; non-local prior; moment prior; discrete Bayesian networks; asymmetric discrete models; Bayes factor search


Collazo, Rodrigo A.; Smith, Jim Q. A New Family of Non-Local Priors for Chain Event Graph Model Selection. Bayesian Anal. 11 (2016), no. 4, 1165–1201. doi:10.1214/15-BA981.


Supplemental materials

  • Pairwise Non-Local Priors for CEG Model Selection: Supplementary Material. The supplementary document includes the normalisation constants of pm-NLPs using Hellinger distance and its extension to ρ-norm space (ρ ∈ ℕ+), the computational results for all simulations presented here (Section 4) using the Hellinger pm-NLPs, and all CEG models found in Section 4.2 by the AHC algorithm using local and non-local priors.