Bayesian Analysis

On the Geometry of Bayesian Inference

Miguel de Carvalho, Garritt L. Page, and Bradley J. Barney

Full-text: Open access


We provide a geometric interpretation of Bayesian inference that allows us to introduce a natural measure of the level of agreement between priors, likelihoods, and posteriors. The starting point for the construction of our geometry is the observation that the marginal likelihood can be regarded as an inner product between the prior and the likelihood. A key concept in our geometry is that of compatibility, a measure which is based on the same construction principles as Pearson correlation, but which can be used to assess how much the prior agrees with the likelihood, to gauge the sensitivity of the posterior to the prior, and to quantify the coherency of the opinions of two experts. Estimators for all the quantities involved in our geometric setup are discussed, which can be directly computed from the posterior simulation output. Some examples are used to illustrate our methods, including data related to on-the-job drug usage, midge wing length, and prostate cancer.
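As a rough illustration of the inner-product view described in the abstract, the following sketch approximates the marginal likelihood as $\langle \pi, \ell \rangle = \int \pi(\theta)\,\ell(\theta)\,d\theta$ on a grid and forms a cosine-type ratio $\langle f, g \rangle / (\|f\|\,\|g\|)$. It assumes that compatibility is this normalized inner product, which is how the analogy with Pearson correlation is read here. The Beta(2, 2) prior, the binomial data (7 successes in 10 trials), the grid size, and the function names `inner` and `compatibility` are all hypothetical choices made for the example and are not taken from the paper.

```python
import math
import numpy as np


def inner(f, g, dx):
    """Midpoint-rule approximation of <f, g> = integral of f * g on a uniform grid."""
    return float(np.sum(f * g) * dx)


def compatibility(f, g, dx):
    """Cosine of the angle between f and g: <f, g> / (||f|| * ||g||)."""
    return inner(f, g, dx) / math.sqrt(inner(f, f, dx) * inner(g, g, dx))


# Midpoints of a uniform grid on the parameter space (0, 1).
n_grid = 10_000
dx = 1.0 / n_grid
theta = (np.arange(n_grid) + 0.5) * dx

# Hypothetical setup (not the paper's data): Beta(2, 2) prior and a
# binomial likelihood for 7 successes in 10 trials.
prior = 6.0 * theta * (1.0 - theta)
likelihood = math.comb(10, 7) * theta**7 * (1.0 - theta) ** 3

# Marginal likelihood as the inner product of prior and likelihood.
marginal = inner(prior, likelihood, dx)

# Posterior density on the grid, via Bayes' theorem.
posterior = prior * likelihood / marginal

# Agreement between prior and likelihood, and between prior and posterior.
kappa_prior_lik = compatibility(prior, likelihood, dx)
kappa_prior_post = compatibility(prior, posterior, dx)

print(f"marginal likelihood: {marginal:.6f}")
print(f"prior-likelihood compatibility: {kappa_prior_lik:.4f}")
print(f"prior-posterior compatibility: {kappa_prior_post:.4f}")
```

Because the ratio is normalized by both norms, rescaling the likelihood by a positive constant leaves the value unchanged, and for nonnegative functions it lies between 0 and 1. A grid is used only because the example is one-dimensional; the abstract notes that the paper's own estimators work from posterior simulation output instead.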

Article information

Bayesian Anal., Volume 14, Number 4 (2019), 1013-1036.

First available in Project Euclid: 10 August 2018

Keywords: Bayesian inference; geometry; Hellinger affinity; Hilbert space; marginal likelihood

Creative Commons Attribution 4.0 International License.


de Carvalho, Miguel; Page, Garritt L.; Barney, Bradley J. On the Geometry of Bayesian Inference. Bayesian Anal. 14 (2019), no. 4, 1013–1036. doi:10.1214/18-BA1112.

Supplemental materials

  • Supplementary Material to “On the Geometry of Bayesian Inference”. The online supplementary materials include the counterparts of the data examples in the paper for the case of affine-compatibility as introduced in Section 3.2, technical derivations, and proofs of propositions.