The Annals of Applied Probability

Optimal scaling for partially updating MCMC algorithms

Peter Neal and Gareth Roberts


In this paper we consider optimal scaling problems for high-dimensional Metropolis–Hastings algorithms where updates can be chosen to be lower dimensional than the target density itself. We find that the optimal scaling rule for the Metropolis algorithm, which tunes the overall algorithm acceptance rate to be 0.234, holds for the so-called Metropolis-within-Gibbs algorithm as well. Furthermore, the optimal efficiency obtainable is independent of the dimensionality of the update rule. This has important implications for the MCMC practitioner: high-dimensional updates are generally more computationally demanding, so lower-dimensional updates are to be preferred. Similar results with rather different conclusions are given for so-called Langevin updates. In this case, it is found that high-dimensional updates are frequently most efficient, even taking into account computing costs.
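The partial-updating idea can be illustrated with a minimal sketch, which is not the paper's own code. It assumes a product of standard normal densities as the target, a random-walk proposal that updates k randomly chosen components out of d, and a proposal standard deviation of ell / sqrt(k) with the commonly quoted value ell = 2.38. The function name `mwg_acceptance_rate` and all default values are hypothetical choices made for illustration.

```python
import math
import random


def mwg_acceptance_rate(d=100, k=25, ell=2.38, n_iter=4000, seed=0):
    """Random-walk Metropolis-within-Gibbs on an assumed N(0, I_d) target.

    Each iteration proposes a move in k randomly chosen components and
    accepts or rejects it with the Metropolis rule. Returns the observed
    acceptance rate.
    """
    rng = random.Random(seed)
    sigma = ell / math.sqrt(k)  # assumed scaling: shrink the step as k grows
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]  # start from the target itself
    accepted = 0
    for _ in range(n_iter):
        idx = rng.sample(range(d), k)  # components updated this iteration
        proposal = [x[i] + sigma * rng.gauss(0.0, 1.0) for i in idx]
        # Log density ratio for a standard normal target; components that
        # are not updated cancel out of the ratio.
        log_ratio = 0.5 * (sum(x[i] ** 2 for i in idx)
                           - sum(y * y for y in proposal))
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            for i, y in zip(idx, proposal):
                x[i] = y
            accepted += 1
    return accepted / n_iter


if __name__ == "__main__":
    for k in (10, 25, 100):
        print(k, mwg_acceptance_rate(d=100, k=k))
```

Under these assumptions the observed acceptance rates should sit in the broad neighbourhood of 0.234 whether a quarter of the components or all of them are updated, with some finite-dimension deviation for small k. In practice ell would be tuned by monitoring the acceptance rate rather than fixed in advance.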

Article information

Ann. Appl. Probab., Volume 16, Number 2 (2006), 475-515.

First available in Project Euclid: 29 June 2006

Digital Object Identifier: 10.1214/105051605000000791

Primary: 60F05: Central limit and other weak theorems
Secondary: 65C05: Monte Carlo methods

Keywords: Metropolis algorithm; Langevin algorithm; Markov chain Monte Carlo; weak convergence; optimal scaling


Neal, Peter; Roberts, Gareth. Optimal scaling for partially updating MCMC algorithms. Ann. Appl. Probab. 16 (2006), no. 2, 475–515. doi:10.1214/105051605000000791.
