Bayesian Analysis

Equivalence between the Posterior Distribution of the Likelihood Ratio and a p-value in an Invariant Frame

Isabelle Smith and André Ferrari

Full-text: Open access


The Posterior distribution of the Likelihood Ratio (PLR) was proposed by Dempster in 1973 for significance testing in the simple vs. composite hypothesis case. In this setting, classical frequentist and Bayesian hypothesis tests are irreconcilable, as emphasized by Lindley’s paradox, by Berger & Sellke (1987), and by many others. However, Dempster showed that the PLR (with inner threshold 1) is equal to the frequentist p-value in the simple Gaussian case. In 1997, Aitkin extended this result by adding a nuisance parameter and showing its asymptotic validity under more general distributions. Here we extend the reconciliation between the PLR and a frequentist p-value to finite samples, through a framework analogous to that of Stein’s theorem, in which a credible (Bayesian) domain is equal to a confidence (frequentist) domain.
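The simple Gaussian case mentioned in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a single observation x ~ N(θ, σ²) with known σ, a point null θ = θ0, and a flat prior on θ (so that θ | x ~ N(x, σ²)). It also assumes the convention LR(θ) = f(x | θ0) / f(x | θ), with the PLR at threshold 1 read as Pr(LR(θ) ≥ 1 | x); the paper's own definition may orient the inequality differently. All function and variable names are hypothetical.

```python
import random
from statistics import NormalDist


def two_sided_p_value(x, theta0, sigma):
    """Two-sided p-value for H0: theta = theta0, given one draw x ~ N(theta, sigma^2)."""
    z = abs(x - theta0) / sigma
    return 2.0 * (1.0 - NormalDist().cdf(z))


def plr_monte_carlo(x, theta0, sigma, n_draws=200_000, seed=0):
    """Monte Carlo estimate of Pr(LR(theta) >= 1 | x).

    Assumes a flat prior, so the posterior is theta | x ~ N(x, sigma^2),
    and LR(theta) = f(x | theta0) / f(x | theta). Log-likelihoods are
    compared directly; the shared normalizing constant cancels.
    """
    rng = random.Random(seed)
    log_lik_h0 = -0.5 * ((x - theta0) / sigma) ** 2
    hits = 0
    for _ in range(n_draws):
        theta = rng.gauss(x, sigma)  # one posterior draw
        log_lik_theta = -0.5 * ((x - theta) / sigma) ** 2
        if log_lik_h0 >= log_lik_theta:  # equivalent to LR(theta) >= 1
            hits += 1
    return hits / n_draws


# Hypothetical observation and null value, chosen only for illustration.
x_obs, theta_null, sigma_known = 1.7, 0.0, 1.0
p_val = two_sided_p_value(x_obs, theta_null, sigma_known)
plr = plr_monte_carlo(x_obs, theta_null, sigma_known)
print(f"p-value = {p_val:.4f}, PLR estimate = {plr:.4f}")
```

Under these assumptions the event LR(θ) ≥ 1 is the same as |θ − x| ≥ |x − θ0|, whose posterior probability is 2(1 − Φ(|x − θ0| / σ)), so the two printed numbers should agree up to Monte Carlo error.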

Article information

Bayesian Anal., Volume 9, Number 4 (2014), 939-962.

First available in Project Euclid: 21 November 2014

Keywords: hypothesis testing; PLR; p-value; likelihood ratio; frequentist and Bayesian reconciliation; Lindley’s paradox; invariance


Smith, Isabelle; Ferrari, André. Equivalence between the Posterior Distribution of the Likelihood Ratio and a p-value in an Invariant Frame. Bayesian Anal. 9 (2014), no. 4, 939–962. doi:10.1214/14-BA877.


  • Aitkin, M. (1997). “The calibration of p-values, posterior Bayes factors and the AIC from the posterior distribution of the likelihood.” Statistics and Computing, 7: 253–261.
  • — (2010). Statistical inference: an integrated Bayesian / likelihood approach. Chapman and Hall.
  • Aitkin, M., Boys, R. J., and Chadwick, T. (2005). “Bayesian point null hypothesis testing via the posterior likelihood ratio.” Statistics and Computing, 15(3): 217–230.
  • Aitkin, M., Liu, C. C., and Chadwick, T. (2009). “Bayesian model comparison and model averaging for small-area estimation.” Annals of Applied Statistics, 3(1): 199–221.
  • Baskurt, Z. and Evans, M. (2013). “Hypothesis assessment and inequalities for Bayes factors and relative belief ratios.” Bayesian Analysis, 8(3): 569–590.
  • Berger, J. and Sellke, T. (1987). “Testing a point null hypothesis: the irreconcilability of P values and evidence (with discussion).” Journal of the American Statistical Association, 82: 112–139.
  • Berger, J. O. (1985). Statistical decision theory and Bayesian analysis. Springer-Verlag, 2nd edition.
  • Berger, J. O., Brown, L., and Wolpert, R. (1994). “A unified conditional frequentist and Bayesian test for fixed and sequential simple hypothesis testing.” Annals of Statistics, 22(4): 1787–1807.
  • Berger, J. O. and Delampady, M. (1987). “Testing precise hypotheses (with discussion).” Statistical Science, 2(3): 317–335.
  • Bernardo, J. (2011). Bayesian Statistics 9, chapter Integrated objective Bayesian estimation and hypothesis testing. Oxford University Press.
  • Birnbaum, A. (1962). “On the foundations of statistical inference (with discussion).” Journal of the American Statistical Association, 57(298): 269–326.
  • Borges, W. and Stern, J. (2007). “The rules of logic composition for the Bayesian epistemic e-values.” Logic journal of the IGPL, 15(5–6): 401–420.
  • Casella, G. and Berger, R. L. (1987). “Reconciling Bayesian and frequentist evidence in the one-sided testing problem.” Journal of the American Statistical Association, 82(397): 106–111.
  • Chang, T. and Villegas, C. (1986). “On a theorem of Stein relating Bayesian and classical inferences in group models.” The Canadian Journal of Statistics, 14(4): 289–296.
  • Darmois, G. (1935). “Sur les lois de probabilité à estimation exhaustive.” Compte-Rendu de l’Académie des Sciences de Paris, 200: 1265–1266.
  • Dempster, A. P. (1973). “The direct use of likelihood for significance testing.” In Proceedings of Conference on Foundational Questions in Statistical Inference, 335–354. Aarhus, Denmark.
  • — (1997). “Commentary on the paper by Murray Aitkin, and on discussion by Mervyn Stone.” Statistics and Computing, 7(4): 265–269.
  • Eaton, M. (1989). Group invariance applications in statistics. Regional Conference Series in Probability and Statistics.
  • — (2007). Multivariate statistics. Institute of Mathematical Statistics.
  • Eaton, M. and Sudderth, W. (1999). “Consistency and strong inconsistency of group-invariant predictive inferences.” Bernoulli, 5(5): 833–854.
  • — (2002). “Group invariant inference and right Haar measure.” Journal of Statistical Planning and Inference, 103(1–2): 87–99.
  • Evans, M. (1997). “Bayesian inference procedures derived via the concept of relative surprise.” Communications in Statistics, 26: 1125–1143.
  • Evans, M. and Jang, G. (2010). “Invariant p-values for model checking.” Annals of Statistics, 38(1): 512–525.
  • Fisher, R. A. (1973, 1st ed.: 1956). Statistical methods and scientific inference. Oliver and Boyd, 3rd edition.
  • Fraser, D. A. S. (1961). “The fiducial method and invariance.” Biometrika, 48(3–4): 261–280.
  • Goutis, C. and Casella, G. (1997). “Relationships between post-data accuracy measures.” Annals of the Institute of Statistical Mathematics, 49(4): 711–726.
  • Hwang, J., Casella, G., Robert, C., Wells, M., and Farrell, R. (1992). “Estimation of accuracy in testing.” Annals of Statistics, 20(1): 490–509.
  • Jeffreys, H. (1961). Theory of probability. Oxford University Press, 3rd edition.
  • Lehmann, E. L. and Romano, J. P. (2005). Testing statistical hypotheses. Springer, 3rd edition.
  • Lindley, D. (1957). “A statistical paradox.” Biometrika, 44(1–2): 187–192.
  • Meng, X.-L. (1994). “Posterior predictive p-values.” Annals of Statistics, 22(3): 1142–1160.
  • Nachbin, L. (1965). The Haar integral. Van Nostrand.
  • Newton, M. and Raftery, A. (1994). “Approximate Bayesian inference with the weighted likelihood bootstrap.” Journal of the Royal Statistical Society Series B, 56(1): 3–48.
  • Neyman, J. (1977). “Frequentist probability and frequentist statistics.” Synthese, 36: 97–131.
  • Neyman, J. and Pearson, E. (1933). “On the problem of the most efficient tests of statistical hypotheses.” Philosophical Transactions of the Royal Society of London, Series A, 231: 289–337.
  • Oh, H. and DasGupta, A. (1999). “Comparison of the p-value and posterior probability.” Journal of Statistical Planning and Inference, 76(1–2): 93–107.
  • O’Hagan, A. (1995). “Fractional Bayes factors for model comparison.” Journal of the Royal Statistical Society Series B, 57(1): 99–138.
  • Pereira, C. and Stern, J. (1999). “Evidence and credibility: full Bayesian significance test for precise hypotheses.” Entropy, 1: 104–115.
  • Robert, C. P. (2007). The Bayesian choice. Springer, 2nd edition.
  • Robins, J., van der Vaart, A., and Ventura, V. (2000). “Asymptotic distribution of p-values in composite null models.” Journal of the American Statistical Association, 95(452): 1143–1156.
  • Royall, R. (1997). Statistical evidence: a likelihood paradigm. Chapman and Hall / CRC Press.
  • Sellke, T., Bayarri, M. J., and Berger, J. O. (2001). “Calibration of p-values for testing precise null hypotheses.” American Statistician, 55(1): 62–71.
  • Smith, I. (2010). “Détection d’une source faible : modèles et méthodes statistiques. Application à la détection d’exoplanètes par imagerie directe.” Ph.D. thesis, Université de Nice Sophia-Antipolis.
  • Smith, I. and Ferrari, A. (2010). “The posterior distribution of the likelihood ratio as a measure of evidence.” In International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 391–398.
  • Stein, C. (1965). “Approximation of improper prior measures by prior probability measures.” In Bernoulli, Bayes, Laplace Festschrift, 217–240. Springer-Verlag.
  • Tsao, C. A. (2006). “A note on Lindley’s paradox.” Test, 15(1): 125–139.
  • Villegas, C. (1981). “Inner statistical inference II.” Annals of Statistics, 9(4): 768–776.
  • Zidek, J. (1969). “A representation of Bayesian invariant procedures in terms of Haar measure.” Annals of the Institute of Statistical Mathematics, 21(1): 291–308.