Source: Ann. Statist. Volume 37, Number 2 (2009), 905–938.
Reference analysis produces objective Bayesian inference, in the sense that inferential statements depend only on the assumed model and the available data, and the prior distribution used to make an inference is least informative in a certain information-theoretic sense. Reference priors have been rigorously defined in specific contexts and heuristically defined in general, but a rigorous general definition has been lacking. We produce a rigorous general definition here and then show how an explicit expression for the reference prior can be obtained under very weak regularity conditions. The explicit expression can be used to derive new reference priors both analytically and numerically.