• Bernoulli
  • Volume 19, Number 5A (2013), 1501-1534.

Optimal tuning of the hybrid Monte Carlo algorithm

Alexandros Beskos, Natesh Pillai, Gareth Roberts, Jesus-Maria Sanz-Serna, and Andrew Stuart

Full-text: Open access


We investigate the properties of the hybrid Monte Carlo algorithm (HMC) in high dimensions. HMC develops a Markov chain reversible with respect to a given target distribution $\Pi$ using separable Hamiltonian dynamics with potential $-\log\Pi$. The additional momentum variables are chosen at random from the Boltzmann distribution, and the continuous-time Hamiltonian dynamics are then discretised using the leapfrog scheme. The induced bias is removed via a Metropolis–Hastings accept/reject rule. In the simplified scenario of independent, identically distributed components, we prove that, to obtain an $\mathcal{O}(1)$ acceptance probability as the dimension $d$ of the state space tends to $\infty$, the leapfrog step size $h$ should be scaled as $h=l\times d^{-1/4}$. Therefore, in high dimensions, HMC requires $\mathcal{O}(d^{1/4})$ steps to traverse the state space. We also identify analytically the asymptotically optimal acceptance probability, which turns out to be $0.651$ (to three decimal places). This value optimally balances the cost of generating a proposal, which decreases as $l$ increases (because fewer steps are required to reach the desired final integration time), against the cost related to the average number of proposals required to obtain acceptance, which increases as $l$ increases.

Article information

Bernoulli Volume 19, Number 5A (2013), 1501-1534.

First available in Project Euclid: 5 November 2013

Permanent link to this document

Digital Object Identifier

Mathematical Reviews number (MathSciNet)

Zentralblatt MATH identifier

Hamiltonian dynamics high dimensions optimal acceptance probability leapfrog scheme squared jumping distance


Beskos, Alexandros; Pillai, Natesh; Roberts, Gareth; Sanz-Serna, Jesus-Maria; Stuart, Andrew. Optimal tuning of the hybrid Monte Carlo algorithm. Bernoulli 19 (2013), no. 5A, 1501--1534. doi:10.3150/12-BEJ414.

Export citation


  • [1] Akhmatskaya, E., Bou-Rabee, N. and Reich, S. (2009). A comparison of generalized hybrid Monte Carlo methods with and without momentum flip. J. Comput. Phys. 228 2256–2265.
  • [2] Alexander, F.J., Eyink, G.L. and Restrepo, J.M. (2005). Accelerated Monte Carlo for optimal estimation of time series. J. Stat. Phys. 119 1331–1345.
  • [3] Barbour, A.D. and Chen, L.H.Y. (eds.) (2005). An Introduction to Stein’s Method. Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap. 4. Singapore: Singapore Univ. Press.
  • [4] Bédard, M. (2007). Weak convergence of Metropolis algorithms for non-i.i.d. target distributions. Ann. Appl. Probab. 17 1222–1244.
  • [5] Bédard, M. (2008). Efficient sampling using Metropolis algorithms: Applications of optimal scaling results. J. Comput. Graph. Statist. 17 312–332.
  • [6] Bédard, M. (2008). Optimal acceptance rates for Metropolis algorithms: Moving beyond 0.234. Stochastic Process. Appl. 118 2198–2222.
  • [7] Beskos, A., Pinski, F.J., Sanz-Serna, J.M. and Stuart, A.M. (2011). Hybrid Monte Carlo on Hilbert spaces. Stochastic Process. Appl. 121 2201–2230.
  • [8] Beskos, A., Roberts, G. and Stuart, A. (2009). Optimal scalings for local Metropolis–Hastings chains on nonproduct targets in high dimensions. Ann. Appl. Probab. 19 863–898.
  • [9] Beskos, A., Roberts, G., Stuart, A. and Voss, J. (2008). MCMC methods for diffusion bridges. Stoch. Dyn. 8 319–350.
  • [10] Beskos, A. and Stuart, A. (2009). MCMC methods for sampling function space. In ICIAM 07 6th International Congress on Industrial and Applied Mathematics 337–364. Zürich: Eur. Math. Soc.
  • [11] Cancès, E., Legoll, F. and Stoltz, G. (2007). Theoretical and numerical comparison of some sampling methods for molecular dynamics. M2AN Math. Model. Numer. Anal. 41 351–389.
  • [12] Chen, L., Qin, Z. and Liu, J. (2000). Exploring hybrid Monte Carlo in Bayesian computation. In ISBA Proceedings, 2000.
  • [13] Christensen, O.F., Roberts, G.O. and Rosenthal, J.S. (2005). Scaling limits for the transient phase of local Metropolis–Hastings algorithms. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 253–268.
  • [14] Diaconis, P., Holmes, S. and Neal, R.M. (2000). Analysis of a nonreversible Markov chain sampler. Ann. Appl. Probab. 10 726–752.
  • [15] Duane, S., Kennedy, A.D., Pendleton, B. and Roweth, D. (1987). Hybrid Monte Carlo. Phys. Lett. B 195 216–222.
  • [16] Girolami, M. and Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J. R. Stat. Soc. Ser. B Stat. Methodol. 73 123–214. With discussion and a reply by the authors.
  • [17] Gupta, R., Kilcup, G.W. and Sharpe, S.R. (1988). Tuning the Hybrid Monte Carlo algorithm. Phys. Rev. D 38 1278–1287.
  • [18] Gupta, S., Irbäck, A., Karsch, F. and Petersson, B. (1990). The acceptance probability in the Hybrid Monte Carlo method. Phys. Lett. B 242 437–443.
  • [19] Hairer, E., Lubich, C. and Wanner, G. (2006). Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations, 2nd ed. Springer Series in Computational Mathematics 31. Berlin: Springer.
  • [20] Hairer, E., Nørsett, S.P. and Wanner, G. (1987). Solving Ordinary Differential Equations I: Nonstiff Problems. Springer Series in Computational Mathematics 8. Berlin: Springer.
  • [21] Hall, P. and Heyde, C.C. (1980). Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. New York: Academic Press.
  • [22] Hansmann, U.H.E., Okamoto, Y. and Eisenmenger, F. (1996). Molecular dynamics, Langevin and Hydrid Monte Carlo simulations in a multicanonical ensemble. Chem. Phys. Lett. 259 321–330.
  • [23] Hasenbusch, M. (2001). Speeding up the Hybrid Monte Carlo algorithm for dynamical fermions. Phys. Lett. B 519 177–182.
  • [24] Izaguirre, J.A. and Hampton, S.S. (2004). Shadow hybrid Monte Carlo: An efficient propagator in phase space of macromolecules. J. Comp. Phys. 200 581–604.
  • [25] Leimkuhler, B. and Reich, S. (2004). Simulating Hamiltonian Dynamics. Cambridge Monographs on Applied and Computational Mathematics 14. Cambridge: Cambridge Univ. Press.
  • [26] Liu, J.S. (2008). Monte Carlo Strategies in Scientific Computing. Springer Series in Statistics. New York: Springer.
  • [27] Mattingly, J.C., Pillai, N. and Stuart, A.M. (2012). Diffusion limits of random walk Metropolis algorithms in high dimensions. Ann. Appl. Probab. 22 881–930.
  • [28] Mehlig, B., Heermann, D.W. and Forrest, B.M. (1992). Exact Langevin algorithms. Molecular Phys. 76 1347–1357.
  • [29] Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21 1087–1092.
  • [30] Mohamed, L., Christie, M. and Demyanov, V. (2009). Comparison of stochastic sampling algorithms for uncertainty quantification. Technical report, Institute of Petroleum Engineering, Heriot-Watt Univ., Edinburgh. SPE Reservoir Simulation Symposium.
  • [31] Neal, R.M. (1993). Probabilistic inference using Markov chain Monte Carlo methods. Technical report, Dept. Computer Science, Univ. Toronto.
  • [32] Neal, R.M. (1996). Bayesian Learning for Neural Networks. New York: Springer.
  • [33] Pangali, C.S., Rao, M. and Berne, B.J. (1978). On a novel Monte Carlo scheme for simulating water and aqueous solutions. Chem. Phys. Lett. 55 413–417.
  • [34] Pasarica, C. and Gelman, A. (2010). Adaptively scaling the Metropolis algorithm using expected squared jumped distance. Statist. Sinica 20 343–364.
  • [35] Pillai, N.S., Stuart, A.M. and Thiery, A.H. (2012). Optimal scaling and diffusion limits for the Langevin algorithm in high dimensions. To appear.
  • [36] Roberts, G.O., Gelman, A. and Gilks, W.R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110–120.
  • [37] Roberts, G.O. and Rosenthal, J.S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 255–268.
  • [38] Roberts, G.O. and Rosenthal, J.S. (2001). Optimal scaling for various Metropolis–Hastings algorithms. Statist. Sci. 16 351–367.
  • [39] Roberts, G.O. and Tweedie, R.L. (1996). Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2 341–363.
  • [40] Rossky, P.J., Doll, J.D. and Friedman, H.L. (1978). Brownian dynamics as smart Monte Carlo simulation. J. Chem. Phys. 69 4628–4633.
  • [41] Sanz-Serna, J.M. and Calvo, M.P. (1994). Numerical Hamiltonian Problems. Applied Mathematics and Mathematical Computation 7. London: Chapman & Hall.
  • [42] Schütte, C. (1998). Conformational dynamics: Modelling, theory, algorithm, and application of biomolecules. Habilitation thesis, Dept. Mathematics and Computer Science, Free Univ. Berlin.
  • [43] Sexton, J.C. and Weingarten, D.H. (1992). Hamiltonian evolution for the Hybrid Monte Carlo algorithm. Nuclear Phys. B 380 665–677.
  • [44] Skeel, R.D. (1999). Integration schemes for molecular dynamics and related applications. In The Graduate Student’s Guide to Numerical Analysis’98 (Leicester). Springer Ser. Comput. Math. 26 119–176. Berlin: Springer.
  • [45] Tuckerman, M.E., Berne, B.J., Martyna, G.J. and Klein, M.L. (1993). Efficient molecular dynamics and hybrid Monte Carlo algorithms for path integrals. J. Chem. Phys. 99 2796–2808.
  • [46] Zlochin, M. and Baram, Y. (2001). Manifold stochastic dynamics for Bayesian learning. Neural Comput. 13 2549–2572.