In this article, we establish the asymptotic normality of the posterior distribution for the natural parameter in an exponential family based on independent and identically distributed data. The mode of convergence is expected Kullback-Leibler distance and the number of parameters p is increasing with the sample size n. Using this, we give an asymptotic expansion of the Shannon mutual information valid when p = p_n increases at a sufficiently slow rate. The second term in the asymptotic expansion is the largest term that depends on the prior and can be optimized to give Jeffreys’ prior as the reference prior in the absence of nuisance parameters. In the presence of nuisance parameters, we find an analogous result for each fixed value of the nuisance parameter. In three examples, we determine the rates at which p_n can be allowed to increase while still retaining asymptotic normality and the reference prior property.
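For orientation, the well-known fixed-dimension form of such an expansion can be sketched as follows. This is an illustration under stated assumptions, not the article's increasing-dimension result: p is held fixed, the parameter set is compact, the usual regularity conditions hold, and the symbols w (prior density), I(θ) (Fisher information matrix), and w_J (Jeffreys' prior) are notation introduced here rather than taken from the abstract.

```latex
% Assumed notation: \Theta \sim w is the parameter, X^n = (X_1,\dots,X_n) the i.i.d. sample,
% I(\theta) the Fisher information matrix, p the (fixed) parameter dimension.
\[
  I(\Theta; X^n)
  = \frac{p}{2}\,\log\frac{n}{2\pi e}
  + \int w(\theta)\,\log\frac{\sqrt{\det I(\theta)}}{w(\theta)}\,d\theta
  + o(1).
\]
% The second term is the leading prior-dependent term. Writing
% c = \int \sqrt{\det I(\theta)}\,d\theta and w_J(\theta) = \sqrt{\det I(\theta)}/c, it equals
\[
  \int w(\theta)\,\log\frac{\sqrt{\det I(\theta)}}{w(\theta)}\,d\theta
  = \log c - D\!\left(w \,\|\, w_J\right),
\]
% which is maximized exactly when D(w \| w_J) = 0, that is, when w = w_J.
```

Because the Kullback-Leibler divergence D(w ‖ w_J) is nonnegative and vanishes only at w = w_J, maximizing the second term over priors singles out Jeffreys' prior, which is the sense in which it is the reference prior when there are no nuisance parameters.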