Bayesian Analysis

Posterior Belief Assessment: Extracting Meaningful Subjective Judgements from Bayesian Analyses with Complex Statistical Models

Daniel Williamson and Michael Goldstein

Full-text: Open access


In this paper, we are concerned with attributing meaning to the results of a Bayesian analysis for a problem which is sufficiently complex that we are unable to assert a precise correspondence between the expert probabilistic judgements of the analyst and the particular forms chosen for the prior specification and the likelihood for the analysis. In order to do this, we propose performing a finite collection of additional Bayesian analyses under alternative collections of prior and likelihood modelling judgements that we may also view as representative of our prior knowledge and the problem structure, and use these to compute posterior belief assessments for key quantities of interest. We show that these assessments are closer to our true underlying beliefs than the original Bayesian analysis and use the temporal sure preference principle to establish a probabilistic relationship between our true posterior judgements, our posterior belief assessment and our original Bayesian analysis to make this precise. We exploit second order exchangeability in order to generalise our approach to situations where there are infinitely many alternative Bayesian analyses we might consider as informative for our true judgements so that the method remains tractable even in these cases. We argue that posterior belief assessment is a tractable and powerful alternative to robust Bayesian analysis. We describe a methodology for computing posterior belief assessments in even the most complex of statistical models and illustrate with an example of calibrating an expensive ocean model in order to quantify uncertainty about global mean temperature in the real ocean.
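
As an illustration only, the step of combining a finite collection of alternative analyses can be sketched as follows. The sketch assumes that the combination takes the form of a Bayes linear adjusted expectation, E_G(y) = E(y) + Cov(y, G) Var(G)^{-1} (G - E(G)), where G collects the posterior expectations of the quantity of interest y under each alternative prior and likelihood specification (for example, MCMC posterior means). The abstract does not state this formula. The function names, argument names and second-order prior specifications are hypothetical and would have to be supplied by the analyst.

```python
from typing import List, Sequence


def solve_linear(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations apply to both.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("variance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_expectation(
    prior_mean_y: float,
    prior_mean_g: Sequence[float],
    cov_y_g: Sequence[float],
    var_g: Sequence[Sequence[float]],
    observed_g: Sequence[float],
) -> float:
    """Bayes linear adjusted expectation of y given the vector G.

    observed_g holds the posterior expectations of y computed under each
    alternative analysis (assumed inputs, e.g. MCMC posterior means).
    """
    residual = [g - e for g, e in zip(observed_g, prior_mean_g)]
    weights = solve_linear(var_g, residual)  # Var(G)^{-1} (G - E(G))
    return prior_mean_y + sum(c * w for c, w in zip(cov_y_g, weights))


# Hypothetical numbers: two alternative analyses of a scalar quantity.
assessment = adjusted_expectation(
    prior_mean_y=10.0,
    prior_mean_g=[1.0, 1.0],
    cov_y_g=[1.0, 2.0],
    var_g=[[2.0, 0.0], [0.0, 4.0]],
    observed_g=[3.0, 5.0],
)
```

In this toy call, each alternative analysis shifts the assessment in proportion to its covariance with y and in inverse proportion to its own prior variance. Analyses judged less informative about the true posterior judgement therefore receive less weight.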

Article information

Bayesian Anal., Volume 10, Number 4 (2015), 877-908.

First available in Project Euclid: 31 August 2015


Keywords: prevision, subjective Bayes, temporal sure preference, Bayesian analysis, MCMC


Williamson, Daniel; Goldstein, Michael. Posterior Belief Assessment: Extracting Meaningful Subjective Judgements from Bayesian Analyses with Complex Statistical Models. Bayesian Anal. 10 (2015), no. 4, 877--908. doi:10.1214/15-BA966SI.


