Bernoulli

Optimal tuning of the hybrid Monte Carlo algorithm

Alexandros Beskos, Natesh Pillai, Gareth Roberts, Jesus-Maria Sanz-Serna, and Andrew Stuart

Abstract

We investigate the properties of the hybrid Monte Carlo algorithm (HMC) in high dimensions. HMC develops a Markov chain reversible with respect to a given target distribution $\Pi$ using separable Hamiltonian dynamics with potential $-\log\Pi$. The additional momentum variables are chosen at random from the Boltzmann distribution, and the continuous-time Hamiltonian dynamics are then discretised using the leapfrog scheme. The induced bias is removed via a Metropolis–Hastings accept/reject rule. In the simplified scenario of independent, identically distributed components, we prove that, to obtain an $\mathcal{O}(1)$ acceptance probability as the dimension $d$ of the state space tends to $\infty$, the leapfrog step size $h$ should be scaled as $h=l\times d^{-1/4}$. Therefore, in high dimensions, HMC requires $\mathcal{O}(d^{1/4})$ steps to traverse the state space. We also identify analytically the asymptotically optimal acceptance probability, which turns out to be $0.651$ (to three decimal places). This value optimally balances the cost of generating a proposal, which decreases as $l$ increases (because fewer steps are required to reach the desired final integration time), against the cost related to the average number of proposals required to obtain acceptance, which increases as $l$ increases.
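To make the scaling concrete, here is a minimal sketch of HMC on an i.i.d. standard Gaussian target (a simple instance of the product setting analysed in the paper), with unit mass matrix, a fixed integration time T = 1, leapfrog step size h = l·d^(-1/4), and therefore on the order of d^(1/4) leapfrog steps per proposal. The choice of target, the value of T, and the function names (leapfrog, hmc_acceptance_rate) are illustrative assumptions, not taken from the paper; tuning l until the reported acceptance rate sits near 0.651 mimics the guideline stated in the abstract.

```python
import numpy as np

def leapfrog(q, p, grad_U, h, n_steps):
    """Leapfrog integration of Hamiltonian dynamics with potential U and unit masses."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * h * grad_U(q)              # initial half step for momentum
    for _ in range(n_steps - 1):
        q += h * p                         # full step for position
        p -= h * grad_U(q)                 # full step for momentum
    q += h * p
    p -= 0.5 * h * grad_U(q)               # final half step for momentum
    return q, p

def hmc_acceptance_rate(d=1000, l=1.0, T=1.0, n_iters=2000, seed=0):
    """HMC for an i.i.d. standard Gaussian target in dimension d,
    with step size h = l * d**(-1/4) and roughly T/h leapfrog steps per proposal."""
    rng = np.random.default_rng(seed)
    U = lambda q: 0.5 * np.sum(q ** 2)     # potential -log(target), up to a constant
    grad_U = lambda q: q
    h = l * d ** (-0.25)                   # step-size scaling h = l * d^{-1/4}
    n_steps = max(1, int(round(T / h)))    # O(d^{1/4}) steps to reach time T
    q = rng.standard_normal(d)
    accepts = 0
    for _ in range(n_iters):
        p = rng.standard_normal(d)         # resample momenta from N(0, I)
        H_old = U(q) + 0.5 * np.sum(p ** 2)
        q_new, p_new = leapfrog(q, p, grad_U, h, n_steps)
        H_new = U(q_new) + 0.5 * np.sum(p_new ** 2)
        if np.log(rng.uniform()) < H_old - H_new:   # Metropolis accept/reject
            q = q_new
            accepts += 1
    return accepts / n_iters

print(hmc_acceptance_rate())   # acceptance rate; adjust l so this sits near 0.651
```

In this toy setup, increasing l reduces the number of leapfrog steps per proposal but also lowers the acceptance rate, which is precisely the trade-off that the asymptotically optimal value 0.651 balances.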

Article information

Source
Bernoulli Volume 19, Number 5A (2013), 1501-1534.

Dates
First available in Project Euclid: 5 November 2013

Permanent link to this document
https://projecteuclid.org/euclid.bj/1383661192

Digital Object Identifier
doi:10.3150/12-BEJ414

Mathematical Reviews number (MathSciNet)
MR3129023

Zentralblatt MATH identifier
1287.60090

Keywords
Hamiltonian dynamics; high dimensions; optimal acceptance probability; leapfrog scheme; squared jumping distance

Citation

Beskos, Alexandros; Pillai, Natesh; Roberts, Gareth; Sanz-Serna, Jesus-Maria; Stuart, Andrew. Optimal tuning of the hybrid Monte Carlo algorithm. Bernoulli 19 (2013), no. 5A, 1501–1534. doi:10.3150/12-BEJ414. https://projecteuclid.org/euclid.bj/1383661192

References

  • [1] Akhmatskaya, E., Bou-Rabee, N. and Reich, S. (2009). A comparison of generalized hybrid Monte Carlo methods with and without momentum flip. J. Comput. Phys. 228 2256–2265.
  • [2] Alexander, F.J., Eyink, G.L. and Restrepo, J.M. (2005). Accelerated Monte Carlo for optimal estimation of time series. J. Stat. Phys. 119 1331–1345.
  • [3] Barbour, A.D. and Chen, L.H.Y. (eds.) (2005). An Introduction to Stein’s Method. Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap. 4. Singapore: Singapore Univ. Press.
  • [4] Bédard, M. (2007). Weak convergence of Metropolis algorithms for non-i.i.d. target distributions. Ann. Appl. Probab. 17 1222–1244.
  • [5] Bédard, M. (2008). Efficient sampling using Metropolis algorithms: Applications of optimal scaling results. J. Comput. Graph. Statist. 17 312–332.
  • [6] Bédard, M. (2008). Optimal acceptance rates for Metropolis algorithms: Moving beyond 0.234. Stochastic Process. Appl. 118 2198–2222.
  • [7] Beskos, A., Pinski, F.J., Sanz-Serna, J.M. and Stuart, A.M. (2011). Hybrid Monte Carlo on Hilbert spaces. Stochastic Process. Appl. 121 2201–2230.
  • [8] Beskos, A., Roberts, G. and Stuart, A. (2009). Optimal scalings for local Metropolis–Hastings chains on nonproduct targets in high dimensions. Ann. Appl. Probab. 19 863–898.
  • [9] Beskos, A., Roberts, G., Stuart, A. and Voss, J. (2008). MCMC methods for diffusion bridges. Stoch. Dyn. 8 319–350.
  • [10] Beskos, A. and Stuart, A. (2009). MCMC methods for sampling function space. In ICIAM 07 6th International Congress on Industrial and Applied Mathematics 337–364. Zürich: Eur. Math. Soc.
  • [11] Cancès, E., Legoll, F. and Stoltz, G. (2007). Theoretical and numerical comparison of some sampling methods for molecular dynamics. M2AN Math. Model. Numer. Anal. 41 351–389.
  • [12] Chen, L., Qin, Z. and Liu, J. (2000). Exploring hybrid Monte Carlo in Bayesian computation. In ISBA Proceedings, 2000.
  • [13] Christensen, O.F., Roberts, G.O. and Rosenthal, J.S. (2005). Scaling limits for the transient phase of local Metropolis–Hastings algorithms. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 253–268.
  • [14] Diaconis, P., Holmes, S. and Neal, R.M. (2000). Analysis of a nonreversible Markov chain sampler. Ann. Appl. Probab. 10 726–752.
  • [15] Duane, S., Kennedy, A.D., Pendleton, B. and Roweth, D. (1987). Hybrid Monte Carlo. Phys. Lett. B 195 216–222.
  • [16] Girolami, M. and Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J. R. Stat. Soc. Ser. B Stat. Methodol. 73 123–214. With discussion and a reply by the authors.
  • [17] Gupta, R., Kilcup, G.W. and Sharpe, S.R. (1988). Tuning the Hybrid Monte Carlo algorithm. Phys. Rev. D 38 1278–1287.
  • [18] Gupta, S., Irbäck, A., Karsch, F. and Petersson, B. (1990). The acceptance probability in the Hybrid Monte Carlo method. Phys. Lett. B 242 437–443.
  • [19] Hairer, E., Lubich, C. and Wanner, G. (2006). Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations, 2nd ed. Springer Series in Computational Mathematics 31. Berlin: Springer.
  • [20] Hairer, E., Nørsett, S.P. and Wanner, G. (1987). Solving Ordinary Differential Equations I: Nonstiff Problems. Springer Series in Computational Mathematics 8. Berlin: Springer.
  • [21] Hall, P. and Heyde, C.C. (1980). Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. New York: Academic Press.
  • [22] Hansmann, U.H.E., Okamoto, Y. and Eisenmenger, F. (1996). Molecular dynamics, Langevin and hybrid Monte Carlo simulations in a multicanonical ensemble. Chem. Phys. Lett. 259 321–330.
  • [23] Hasenbusch, M. (2001). Speeding up the Hybrid Monte Carlo algorithm for dynamical fermions. Phys. Lett. B 519 177–182.
  • [24] Izaguirre, J.A. and Hampton, S.S. (2004). Shadow hybrid Monte Carlo: An efficient propagator in phase space of macromolecules. J. Comput. Phys. 200 581–604.
  • [25] Leimkuhler, B. and Reich, S. (2004). Simulating Hamiltonian Dynamics. Cambridge Monographs on Applied and Computational Mathematics 14. Cambridge: Cambridge Univ. Press.
  • [26] Liu, J.S. (2008). Monte Carlo Strategies in Scientific Computing. Springer Series in Statistics. New York: Springer.
  • [27] Mattingly, J.C., Pillai, N. and Stuart, A.M. (2012). Diffusion limits of random walk Metropolis algorithms in high dimensions. Ann. Appl. Probab. 22 881–930.
  • [28] Mehlig, B., Heermann, D.W. and Forrest, B.M. (1992). Exact Langevin algorithms. Molecular Phys. 76 1347–1357.
  • [29] Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21 1087–1092.
  • [30] Mohamed, L., Christie, M. and Demyanov, V. (2009). Comparison of stochastic sampling algorithms for uncertainty quantification. Technical report, Institute of Petroleum Engineering, Heriot-Watt Univ., Edinburgh. SPE Reservoir Simulation Symposium.
  • [31] Neal, R.M. (1993). Probabilistic inference using Markov chain Monte Carlo methods. Technical report, Dept. Computer Science, Univ. Toronto.
  • [32] Neal, R.M. (1996). Bayesian Learning for Neural Networks. New York: Springer.
  • [33] Pangali, C.S., Rao, M. and Berne, B.J. (1978). On a novel Monte Carlo scheme for simulating water and aqueous solutions. Chem. Phys. Lett. 55 413–417.
  • [34] Pasarica, C. and Gelman, A. (2010). Adaptively scaling the Metropolis algorithm using expected squared jumped distance. Statist. Sinica 20 343–364.
  • [35] Pillai, N.S., Stuart, A.M. and Thiery, A.H. (2012). Optimal scaling and diffusion limits for the Langevin algorithm in high dimensions. To appear.
  • [36] Roberts, G.O., Gelman, A. and Gilks, W.R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110–120.
  • [37] Roberts, G.O. and Rosenthal, J.S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 255–268.
  • [38] Roberts, G.O. and Rosenthal, J.S. (2001). Optimal scaling for various Metropolis–Hastings algorithms. Statist. Sci. 16 351–367.
  • [39] Roberts, G.O. and Tweedie, R.L. (1996). Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2 341–363.
  • [40] Rossky, P.J., Doll, J.D. and Friedman, H.L. (1978). Brownian dynamics as smart Monte Carlo simulation. J. Chem. Phys. 69 4628–4633.
  • [41] Sanz-Serna, J.M. and Calvo, M.P. (1994). Numerical Hamiltonian Problems. Applied Mathematics and Mathematical Computation 7. London: Chapman & Hall.
  • [42] Schütte, C. (1998). Conformational dynamics: Modelling, theory, algorithm, and application of biomolecules. Habilitation thesis, Dept. Mathematics and Computer Science, Free Univ. Berlin.
  • [43] Sexton, J.C. and Weingarten, D.H. (1992). Hamiltonian evolution for the Hybrid Monte Carlo algorithm. Nuclear Phys. B 380 665–677.
  • [44] Skeel, R.D. (1999). Integration schemes for molecular dynamics and related applications. In The Graduate Student’s Guide to Numerical Analysis ’98 (Leicester). Springer Ser. Comput. Math. 26 119–176. Berlin: Springer.
  • [45] Tuckerman, M.E., Berne, B.J., Martyna, G.J. and Klein, M.L. (1993). Efficient molecular dynamics and hybrid Monte Carlo algorithms for path integrals. J. Chem. Phys. 99 2796–2808.
  • [46] Zlochin, M. and Baram, Y. (2001). Manifold stochastic dynamics for Bayesian learning. Neural Comput. 13 2549–2572.