Bayesian Analysis

Estimating the Marginal Likelihood Using the Arithmetic Mean Identity

Anna Pajor

Abstract

In this paper we propose a conceptually straightforward method for estimating the value of the marginal data density (also called the marginal likelihood). We show that the marginal likelihood equals the prior mean of the conditional density of the data given the parameter vector, restricted to a certain subset A of the parameter space, times the reciprocal of the posterior probability of A. This identity motivates the use of the Arithmetic Mean estimator based on simulation from the prior distribution restricted to any reasonable subset of the parameter space. By trimming this space, regions of relatively low likelihood are removed, which improves the efficiency of the Arithmetic Mean estimator. We show that the adjusted Arithmetic Mean estimator is unbiased and consistent.
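The identity above can be illustrated with a short numerical sketch. Everything below is an illustrative assumption rather than material from the paper: the toy model (a normal mean with known variance and a conjugate normal prior, chosen because its marginal likelihood has a closed form), the choice of the subset A as a central 90% posterior credible interval, and the use of exact posterior draws to estimate the posterior probability of A (in practice this probability would be estimated from MCMC output).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy model (assumption, not from the paper): y_i ~ N(mu, sigma^2) with sigma known,
# prior mu ~ N(mu0, tau0^2), so the true marginal likelihood has a closed form.
sigma, mu0, tau0 = 1.0, 0.0, 3.0
y = rng.normal(1.5, sigma, size=50)
n, ybar = len(y), y.mean()

# Conjugate posterior for mu (used to define A and to compute Pr(A | y)).
post_var = 1.0 / (1.0 / tau0**2 + n / sigma**2)
post_mean = post_var * (mu0 / tau0**2 + n * ybar / sigma**2)

def log_lik(mu):
    """Log-likelihood of the full sample for each value in the array mu."""
    return stats.norm.logpdf(y[:, None], loc=mu, scale=sigma).sum(axis=0)

# Subset A: a central 90% posterior credible interval for mu (one simple choice).
lo, hi = stats.norm.ppf([0.05, 0.95], loc=post_mean, scale=np.sqrt(post_var))

# Step 1: prior mean of p(y | mu) * 1_A(mu), estimated by simulation from the prior.
# (For larger data sets one would average in log space via log-sum-exp.)
mu_prior = rng.normal(mu0, tau0, size=100_000)
in_A = (mu_prior >= lo) & (mu_prior <= hi)
prior_mean_lik = np.mean(np.where(in_A, np.exp(log_lik(mu_prior)), 0.0))

# Step 2: posterior probability of A, here from exact posterior draws;
# with a non-conjugate model these draws would come from an MCMC sampler.
mu_post = rng.normal(post_mean, np.sqrt(post_var), size=100_000)
prob_A = np.mean((mu_post >= lo) & (mu_post <= hi))

# Arithmetic mean identity: p(y) = E_prior[p(y | mu) 1_A(mu)] / Pr(A | y).
log_ml_est = np.log(prior_mean_lik) - np.log(prob_A)

# Closed-form log marginal likelihood for comparison: y ~ N(mu0*1, sigma^2 I + tau0^2 J).
cov = sigma**2 * np.eye(n) + tau0**2 * np.ones((n, n))
log_ml_true = stats.multivariate_normal.logpdf(y, mean=np.full(n, mu0), cov=cov)

print(f"adjusted arithmetic mean estimate of log p(y): {log_ml_est:.3f}")
print(f"closed-form log p(y):                          {log_ml_true:.3f}")
```

Taking A equal to the whole parameter space recovers the plain Arithmetic Mean estimator; restricting A to a high-posterior-probability region discards prior draws that land in low-likelihood regions, which is the source of the efficiency gain described in the abstract.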

Article information

Source
Bayesian Anal. Volume 12, Number 1 (2017), 261–287.

Dates
First available in Project Euclid: 4 April 2016

Permanent link to this document
https://projecteuclid.org/euclid.ba/1459772735

Digital Object Identifier
doi:10.1214/16-BA1001

Keywords
Bayesian inference; Bayesian model selection; marginal likelihood

Rights
Creative Commons Attribution 4.0 International License.

Citation

Pajor, Anna. Estimating the Marginal Likelihood Using the Arithmetic Mean Identity. Bayesian Anal. 12 (2017), no. 1, 261–287. doi:10.1214/16-BA1001. https://projecteuclid.org/euclid.ba/1459772735


