Statistical Science

Simulating normalizing constants: from importance sampling to bridge sampling to path sampling

Andrew Gelman and Xiao-Li Meng

Full-text: Open access

Abstract

Computing (ratios of) normalizing constants of probability models is a fundamental computational problem for many statistical and scientific studies. Monte Carlo simulation is an effective technique, especially with complex and high-dimensional models. This paper aims to bring to the attention of general statistical audiences some effective methods originating from theoretical physics and, at the same time, to explore these methods from a more statistical perspective by establishing theoretical connections and illustrating their uses with statistical problems. We show that the acceptance ratio method and thermodynamic integration are natural generalizations of importance sampling, which is most familiar to statistical audiences. The former generalizes importance sampling through the use of a single "bridge" density and is thus a case of bridge sampling in the sense of Meng and Wong. Thermodynamic integration, which is also known in the numerical analysis literature as Ogata's method for high-dimensional integration, corresponds to the use of infinitely many and continuously connected bridges (and thus a "path"). Our path sampling formulation offers more flexibility and thus potential efficiency to thermodynamic integration, and the search for optimal paths turns out to have close connections with the Jeffreys prior density and the Rao and Hellinger distances between two densities. We provide an informative theoretical example as well as two empirical examples (involving 17- to 70-dimensional integrations) to illustrate the potential and implementation of path sampling. We also discuss some open problems.
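As a rough illustration of the progression the abstract describes, the sketch below estimates the same ratio of normalizing constants three ways on a toy problem that is not taken from the paper. It assumes unnormalized zero-mean Gaussian densities q(x | sigma) = exp(-x^2 / (2 sigma^2)), whose normalizing constant is sqrt(2 pi) sigma, so the true ratio z1/z0 equals sigma1/sigma0. The endpoints SIGMA0 and SIGMA1, the function names, the geometric bridge, the linear path in sigma, and the uniform draw of the path parameter t are all illustrative assumptions.

```python
import numpy as np

# Hypothetical endpoints of the toy problem; true ratio z1/z0 = SIGMA1/SIGMA0 = 0.5.
SIGMA0, SIGMA1 = 2.0, 1.0


def log_q(x, sigma):
    """Log of the unnormalized Gaussian density exp(-x^2 / (2 sigma^2))."""
    return -0.5 * (x / sigma) ** 2


def importance_sampling(n, rng):
    """Estimate z1/z0 as the mean of q1/q0 over draws from density 0."""
    x = rng.normal(0.0, SIGMA0, size=n)
    return np.mean(np.exp(log_q(x, SIGMA1) - log_q(x, SIGMA0)))


def bridge_sampling(n, rng):
    """Estimate z1/z0 with a single geometric bridge sqrt(q0 * q1).

    Numerator: mean of sqrt(q1/q0) over draws from density 0.
    Denominator: mean of sqrt(q0/q1) over draws from density 1.
    """
    x0 = rng.normal(0.0, SIGMA0, size=n)
    x1 = rng.normal(0.0, SIGMA1, size=n)
    num = np.mean(np.exp(0.5 * (log_q(x0, SIGMA1) - log_q(x0, SIGMA0))))
    den = np.mean(np.exp(0.5 * (log_q(x1, SIGMA0) - log_q(x1, SIGMA1))))
    return num / den


def path_sampling(n, rng):
    """Estimate log(z1/z0) by averaging d/dt log q(x | t) along a path.

    Assumed path: sigma(t) = SIGMA0 + t * (SIGMA1 - SIGMA0), t uniform on [0, 1].
    For this family, d/dt log q(x | t) = x^2 * sigma'(t) / sigma(t)^3.
    """
    t = rng.uniform(0.0, 1.0, size=n)
    sigma_t = SIGMA0 + t * (SIGMA1 - SIGMA0)
    x = rng.normal(0.0, sigma_t)  # one draw from the density at each t
    u = x ** 2 * (SIGMA1 - SIGMA0) / sigma_t ** 3
    return np.mean(u)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 100_000
    print("true ratio           :", SIGMA1 / SIGMA0)
    print("importance sampling  :", importance_sampling(n, rng))
    print("bridge sampling      :", bridge_sampling(n, rng))
    print("path sampling (ratio):", np.exp(path_sampling(n, rng)))
```

In this toy setting the intermediate densities can be sampled exactly; in realistic problems the draws at each point on the path would typically come from Markov chain Monte Carlo, and the choice of path affects the variance of the estimate.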

Article information

Source
Statist. Sci. Volume 13, Number 2 (1998), 163-185.

Dates
First available in Project Euclid: 9 August 2002

Permanent link to this document
https://projecteuclid.org/euclid.ss/1028905934

Digital Object Identifier
doi:10.1214/ss/1028905934

Mathematical Reviews number (MathSciNet)
MR1647507

Zentralblatt MATH identifier
0966.65004

Keywords
Acceptance ratio method; Hellinger distance; Jeffreys prior density; Markov chain Monte Carlo; numerical integration; Rao distance; thermodynamic integration

Citation

Gelman, Andrew; Meng, Xiao-Li. Simulating normalizing constants: from importance sampling to bridge sampling to path sampling. Statist. Sci. 13 (1998), no. 2, 163--185. doi:10.1214/ss/1028905934. https://projecteuclid.org/euclid.ss/1028905934.



References

  • Atkinson, C. and Mitchell, A. F. S. (1981). Rao's distance measure. Sankhyā Ser. A 43 345-365.
  • Bennett, C. H. (1976). Efficient estimation of free energy differences from Monte Carlo data. J. Comput. Phys. 22 245-268.
  • Berg, B. and Celik, T. (1992). New approach to spin-glass simulations. Phys. Rev. Lett. 69 2292-2295.
  • Berg, B. and Neuhaus, T. (1991). Multicanonical algorithms for the first order phase transitions. Phys. Lett. B 267 249-253.
  • Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. Roy. Statist. Soc. Ser. B 36 192-236.
  • Binder, K. (1986). Introduction: theory and technical aspects of Monte Carlo simulations. In Monte Carlo Methods in Statistical Physics (K. Binder, ed.). Topics in Current Physics 7. Springer, Berlin.
  • Boscardin, W. J. and Gelman, A. (1996). Bayesian regression with parametric models for heteroscedasticity. Advances in Econometrics 11A 87-109.
  • Burbea, J. (1989). Rao distance. In Encyclopedia of Statistical Science, supplement volume, 128-130. Wiley, New York.
  • Ceperley, D. M. (1995). Path integrals in the theory of condensed helium. Rev. Modern Phys. 67 279-355.
  • Chen, M.-H. and Shao, Q. M. (1997a). On Monte Carlo methods for estimating ratios of normalizing constants. Ann. Statist. 25 1563-1594.
  • Chen, M.-H. and Shao, Q. M. (1997b). Estimating ratios of normalizing constants for densities with different dimensions. Statist. Sinica 7 607-630.
  • Chib, S. (1995). Marginal likelihood from the Gibbs output. J. Amer. Statist. Assoc. 90 1313-1321.
  • Ciccotti, G. and Hoover, W. G., eds. (1986). Molecular-Dynamics Simulation of Statistical-Mechanical Systems. North-Holland, Amsterdam.
  • Courant, R. and Hilbert, D. (1961). Methods of Mathematical Physics 2. Wiley, New York.
  • Dempster, A. P., Selwyn, M. R. and Weeks, B. J. (1983). Combining historical and randomized controls for assessing trends in proportions. J. Amer. Statist. Assoc. 78 221-227.
  • DiCiccio, T. J., Kass, R. E., Raftery, A. and Wasserman, L. (1997). Computing Bayes factors by combining simulation and asymptotic approximations. J. Amer. Statist. Assoc. 92 903-915.
  • Evans, M. and Swartz, T. (1995). Methods for approximating integrals in statistics with special emphasis on Bayesian integration problems. Statist. Sci. 10 254-272.
  • Frenkel, D. and Smit, B. (1996). Understanding Molecular Simulation. Academic Press, New York.
  • Frenkel, D. (1986). Free-energy computation and first-order phase transition. In Molecular-Dynamics Simulation of Statistical-Mechanical Systems (G. Ciccotti and W. G. Hoover, eds.) 151-188. North-Holland, Amsterdam.
  • Gelfand, A. E. and Dey, D. K. (1994). Bayesian model choice: asymptotic and exact calculations. J. Roy. Statist. Soc. Ser. B 56 501-514.
  • Gelfand, A. E. and Smith, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. J. Amer. Statist. Assoc. 85 398-409.
  • Gelman, A., King, G. and Boscardin, J. (1998). Estimating the probability of events that have never occurred: when is your vote decisive? J. Amer. Statist. Assoc. 93 1-9.
  • Gelman, A. and Meng, X. L. (1994). Path sampling for computing normalizing constants: identities and theory. Technical Report 376, Dept. Statistics, Univ. Chicago.
  • Gelman, A., Roberts, G. O. and Gilks, W. R. (1996). Efficient Metropolis jumping rules. In Bayesian Statistics 5 (J. O. Berger, J. M. Bernardo, D. V. Lindley and A. F. M. Smith, eds.) 599-607. Oxford Univ. Press.
  • Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences (with discussion). Statist. Sci. 7 457-511.
  • Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6 721-741.
  • Geyer, C. J. (1991). Markov chain Monte Carlo maximum likelihood. In Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface (E. M. Keramidas, ed.) 156-163. Interface Foundation, Fairfax Station, VA.
  • Geyer, C. J. (1994). Estimating normalizing constants and reweighting mixtures in Markov chain Monte Carlo. Technical Report 568, School of Statistics, Univ. Minnesota.
  • Geyer, C. J. and Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data (with discussion). J. Roy. Statist. Soc. Ser. B 54 657-699.
  • Geyer, C. J. and Thompson, E. A. (1995). Annealing Markov chain Monte Carlo with applications to ancestral inference. J. Amer. Statist. Assoc. 90 909-920.
  • Green, P. J. (1992). Discussion of "Constrained Monte Carlo maximum likelihood for dependent data" by C. J. Geyer and E. A. Thompson. J. Roy. Statist. Soc. Ser. B 54 683-684.
  • Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 97-109.
  • Irwin, M., Cox, N. and Kong, A. (1994). Sequential imputation for multilocus linkage analysis. Proc. Nat. Acad. Sci. U.S.A. 91 11684-11688.
  • Jensen, C. S. and Kong, A. (1998). Blocking Gibbs sampling for linkage analysis in large pedigrees with many loops. American Journal of Human Genetics. To appear.
  • Kass, R. E. and Vos, P. W. (1997). Geometrical Foundations of Asymptotic Inference. Wiley, New York.
  • Kong, A., Liu, J. and Wong, W. H. (1994). Sequential imputations and Bayesian missing data problems. J. Amer. Statist. Assoc. 89 278-288.
  • Lewis, S. M. and Raftery, A. E. (1997). Estimating Bayes factors via posterior simulation with Laplace-Metropolis estimator. J. Amer. Statist. Assoc. 92 648-663.
  • Marinari, E. and Parisi, G. (1992). Simulated tempering: a new Monte Carlo scheme. Europhys. Lett. 19 451-458.
  • Meng, X. L. and Schilling, S. (1996). Fitting full-information factor models and an empirical investigation of bridge sampling. J. Amer. Statist. Assoc. 91 1254-1267.
  • Meng, X. L. and Wong, W. H. (1996). Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statist. Sinica 6 831-860.
  • Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. and Teller, E. (1953). Equation of state calculations by fast computing machines. Journal of Chemical Physics 21 1087-1092.
  • Mitchell, A. F. S. (1992). Estimative and predictive distances. Test 1 105-121.
  • Neal, R. M. (1993). Probabilistic inference using Markov chain Monte Carlo methods. Technical Report CRG-TR-93-1, Dept. Computer Science, Univ. Toronto.
  • Newton, M. A. and Raftery, A. E. (1994). Approximate Bayesian inference and the weighted likelihood bootstrap (with discussion). J. Roy. Statist. Soc. Ser. B 56 3-48.
  • Ogata, Y. (1989). A Monte Carlo method for high dimensional integration. Numer. Math. 55 137-157.
  • Ogata, Y. (1990). A Monte Carlo method for an objective Bayesian procedure. Ann. Inst. Statist. Math. 42 403-433.
  • Ogata, Y. (1994). Evaluation of Bayesian visualization models: two computational methods. Research Memorandum 503, Institute of Statistical Mathematics, Tokyo.
  • Ogata, Y. and Tanemura, M. (1984). Likelihood analysis of spatial point patterns. J. Roy. Statist. Soc. Ser. B 46 496-518.
  • Ott, J. (1979). Maximum likelihood estimation by counting methods under polygenic and mixed models in human pedigrees. American Journal of Human Genetics 31 161-175.
  • Raftery, A. E. (1996). Hypothesis testing and model selection via posterior simulation. In Practical Markov Chain Monte Carlo (W. Gilks, S. Richardson and D. J. Spiegelhalter, eds.) 163-187. Chapman and Hall, London.
  • Rao, C. R. (1945). Information and the accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc. 37 81-91.
  • Rao, C. R. (1949). On the distance between two populations. Sankhyā 9 246-248.
  • Ripley, B. D. (1988). Statistical Inference for Spatial Processes. Cambridge Univ. Press.
  • Stein, M. (1992). Prediction and inference for truncated spatial data. J. Comput. Graph. Statist. 1 91-110.
  • Thompson, E. A. (1996). Likelihood and linkage: from Fisher to the future. Ann. Statist. 24 449-465.
  • Torrie, G. M. and Valleau, J. P. (1977). Nonphysical sampling distributions in Monte Carlo free-energy estimation: umbrella sampling. J. Comput. Phys. 23 187-199.