The Annals of Statistics

On the efficiency of pseudo-marginal random walk Metropolis algorithms

Chris Sherlock, Alexandre H. Thiery, Gareth O. Roberts, and Jeffrey S. Rosenthal


We examine the behaviour of the pseudo-marginal random walk Metropolis algorithm, where evaluations of the target density for the accept/reject probability are estimated rather than computed precisely. Under relatively general conditions on the target distribution, we obtain limiting formulae for the acceptance rate and for the expected squared jump distance, as the dimension of the target approaches infinity, under the assumption that the noise in the estimate of the log-target is additive and is independent of the position. For targets with independent and identically distributed components, we also obtain a limiting diffusion for the first component.
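
As an illustration of the setting described above, the following is a minimal sketch of a pseudo-marginal random walk Metropolis sampler. Everything specific in it is an assumption made for the example rather than taken from the article: the target is a standard Gaussian, the function name `pseudo_marginal_rwm` and its parameters are hypothetical, and the noise W added to the log-target is drawn as N(-noise_var/2, noise_var) so that exp(W) has mean one, independently of position.

```python
import math
import random


def pseudo_marginal_rwm(n_iter, dim, step, noise_var, seed=0):
    """Toy pseudo-marginal RWM on a standard Gaussian target (illustrative only).

    Returns (acceptance rate, expected squared jump distance).
    """
    rng = random.Random(seed)
    noise_sd = math.sqrt(noise_var)

    def log_target(x):
        # Assumed target: standard Gaussian in `dim` dimensions.
        return -0.5 * sum(v * v for v in x)

    def draw_noise():
        # Additive noise in the log-target, independent of position.
        # Mean -noise_var/2 makes exp(noise) an unbiased multiplicative error.
        return rng.gauss(-0.5 * noise_var, noise_sd)

    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    w = draw_noise()  # noise attached to the current state
    accepted = 0
    squared_jump = 0.0

    for _ in range(n_iter):
        proposal = [v + step * rng.gauss(0.0, 1.0) for v in x]
        w_prop = draw_noise()  # fresh noise for the proposal only
        log_ratio = (log_target(proposal) + w_prop) - (log_target(x) + w)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            squared_jump += sum((a - b) ** 2 for a, b in zip(proposal, x))
            x, w = proposal, w_prop  # the accepted noise is kept, not redrawn
            accepted += 1

    return accepted / n_iter, squared_jump / n_iter


if __name__ == "__main__":
    for var in (0.0, 3.0):
        rate, esjd = pseudo_marginal_rwm(5000, dim=10, step=0.75, noise_var=var)
        print(f"noise_var={var}: acceptance={rate:.3f}, ESJD={esjd:.3f}")
```

The noise attached to the current state is reused until a proposal is accepted, which is what distinguishes the pseudo-marginal chain from one that re-estimates the current value at every iteration. Setting `noise_var=0.0` recovers the ordinary random walk Metropolis for comparison.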

We then consider the overall efficiency of the algorithm, in terms of both speed of mixing and computational time. Assuming the additive noise is Gaussian with variance inversely proportional to the number of unbiased estimates used, we prove that the algorithm is optimally efficient when the variance of the noise is approximately 3.283 and the acceptance rate is approximately 7.001%. We also find that the optimal scaling is insensitive to the noise and that the optimal variance of the noise is insensitive to the scaling. The theory is illustrated with a simulation study using the particle marginal random walk Metropolis.
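
The trade-off between mixing speed and computational time can be explored numerically. The sketch below is an illustration under assumptions that the abstract does not state: it supposes that the limiting acceptance rate for scaling `l` and noise variance `sigma2` has the form 2Φ(-½√(l² + 2σ²)), that the expected squared jump distance is proportional to l² times this rate, and that the cost per iteration is proportional to 1/σ². The function names `acceptance_rate`, `efficiency` and `find_optimum` are hypothetical.

```python
import math


def acceptance_rate(l, sigma2):
    # Assumed limiting form: 2 * Phi(-0.5 * sqrt(l^2 + 2 * sigma2)).
    u = 0.5 * math.sqrt(l * l + 2.0 * sigma2)
    # 2 * Phi(-u) equals erfc(u / sqrt(2)).
    return math.erfc(u / math.sqrt(2.0))


def efficiency(l, sigma2):
    # Assumed ESJD (l^2 * acceptance) divided by assumed cost (1 / sigma2).
    return l * l * sigma2 * acceptance_rate(l, sigma2)


def find_optimum():
    """Grid search over scaling l in [0.5, 5] and noise variance in [0.5, 8]."""
    best = (0.0, None, None)
    for i in range(50, 501):
        l = i / 100.0
        for j in range(50, 801):
            sigma2 = j / 100.0
            value = efficiency(l, sigma2)
            if value > best[0]:
                best = (value, l, sigma2)
    _, l_opt, sigma2_opt = best
    return l_opt, sigma2_opt, acceptance_rate(l_opt, sigma2_opt)


if __name__ == "__main__":
    l_opt, sigma2_opt, rate = find_optimum()
    print(f"l={l_opt:.2f}, sigma2={sigma2_opt:.2f}, acceptance={rate:.4f}")
```

Under these assumed forms, the grid maximum should fall close to the noise variance and acceptance rate quoted above. The grid resolution of 0.01 limits the precision, so the result is a consistency check rather than a derivation.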

Article information

Ann. Statist., Volume 43, Number 1 (2015), 238-275.

First available in Project Euclid: 9 December 2014

Primary: 65C05: Monte Carlo methods; 65C40: Computational Markov chains; 60F05: Central limit and other weak theorems

Markov chain Monte Carlo; MCMC; pseudo-marginal random walk Metropolis; optimal scaling; diffusion limit; particle methods


Sherlock, Chris; Thiery, Alexandre H.; Roberts, Gareth O.; Rosenthal, Jeffrey S. On the efficiency of pseudo-marginal random walk Metropolis algorithms. Ann. Statist. 43 (2015), no. 1, 238–275. doi:10.1214/14-AOS1278.

