Bernoulli

  • Volume 20, Number 4 (2014), 1930-1978.

Optimal scaling for the transient phase of Metropolis Hastings algorithms: The longtime behavior

Benjamin Jourdain, Tony Lelièvre, and Błażej Miasojedow

Full-text: Open access

Abstract

We consider the Random Walk Metropolis algorithm on $\mathbb{R}^{n}$ with Gaussian proposals, when the target probability measure is the $n$-fold product of a one-dimensional law. It is well known (see Roberts et al. (Ann. Appl. Probab. 7 (1997) 110–120)) that, in the limit $n\to\infty$, starting at equilibrium and for an appropriate scaling of the variance and of the timescale as a function of the dimension $n$, a diffusive limit is obtained for each component of the Markov chain. In Jourdain et al. (Optimal scaling for the transient phase of the random walk Metropolis algorithm: The mean-field limit (2012) Preprint), we generalized this result to the case where the initial distribution is not the target probability measure. The diffusive limit obtained there is the solution to a stochastic differential equation that is nonlinear in the sense of McKean. In the present paper, we prove convergence to equilibrium for this equation. We discuss practical counterparts in order to optimize the variance of the proposal distribution and thereby accelerate convergence to equilibrium. Our analysis confirms the interest of the constant acceptance rate strategy (with acceptance rate between $1/4$ and $1/3$) first suggested in Roberts et al. (Ann. Appl. Probab. 7 (1997) 110–120).
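As an illustration of the setting described above, the following is a minimal sketch of the Random Walk Metropolis algorithm with Gaussian proposals of variance $\ell^{2}/n$ on a product target. It is not taken from the paper: the standard Gaussian one-dimensional law, the function name `rwm_acceptance_rate`, and the parameter `ell` are assumptions chosen for illustration only.

```python
import numpy as np

def rwm_acceptance_rate(n, ell, n_steps, x0, rng):
    """Random Walk Metropolis on an n-fold product target.

    Assumption (illustrative only): the one-dimensional law is the
    standard Gaussian, so log pi(x) = -|x|^2 / 2 up to a constant.
    The proposal is Gaussian with variance ell^2 / n per component.
    Returns the empirical acceptance rate and the final state.
    """
    x = np.array(x0, dtype=float)
    sigma = ell / np.sqrt(n)            # proposal standard deviation
    log_pi = lambda y: -0.5 * np.dot(y, y)
    lp = log_pi(x)
    accepted = 0
    for _ in range(n_steps):
        y = x + sigma * rng.standard_normal(n)   # Gaussian proposal
        lp_y = log_pi(y)
        # Metropolis rule: accept with probability min(1, pi(y)/pi(x)).
        # 1 - rng.random() lies in (0, 1], so the logarithm is finite.
        if np.log(1.0 - rng.random()) < lp_y - lp:
            x, lp = y, lp_y
            accepted += 1
    return accepted / n_steps, x
```

Calling the function with `x0` drawn from the target corresponds to starting at equilibrium, while an `x0` far from the bulk of the target mimics the transient phase. Under the Gaussian assumption made here, monitoring the returned acceptance rate while varying `ell` is one simple way to experiment with a constant acceptance rate strategy.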

We also address the scaling of the Metropolis-Adjusted Langevin Algorithm (MALA). When starting at equilibrium, a diffusive limit for an optimal scaling of the variance is obtained in Roberts and Rosenthal (J. R. Stat. Soc. Ser. B. Stat. Methodol. 60 (1998) 255–268). In the transient case, we obtain formally that the optimal variance scales very differently in $n$ depending on the sign of a moment of the distribution, which vanishes at equilibrium. This suggests that it is difficult to derive practical recommendations for MALA from such asymptotic results.
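For readers unfamiliar with MALA, the sketch below shows a single step of the algorithm in its usual textbook form: a Gaussian proposal shifted by half the variance times the gradient of the log-density, followed by a Metropolis-Hastings correction. It is an illustration under stated assumptions rather than the construction analysed in the paper; the names `mala_step`, `log_pi`, and `grad_log_pi` are hypothetical, and the choice of `sigma` is left entirely to the caller.

```python
import numpy as np

def mala_step(x, sigma, log_pi, grad_log_pi, rng):
    """One step of the Metropolis-Adjusted Langevin Algorithm.

    Proposal: y = x + (sigma^2 / 2) * grad log pi(x) + sigma * xi,
    with xi a standard Gaussian vector. Because the proposal is not
    symmetric, the acceptance ratio includes both proposal densities.
    Returns the new state and whether the proposal was accepted.
    """
    mean_fwd = x + 0.5 * sigma**2 * grad_log_pi(x)
    y = mean_fwd + sigma * rng.standard_normal(x.shape)
    mean_bwd = y + 0.5 * sigma**2 * grad_log_pi(y)
    # Log proposal densities; the shared normalising constant cancels.
    log_q_fwd = -np.sum((y - mean_fwd) ** 2) / (2.0 * sigma**2)
    log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2.0 * sigma**2)
    log_alpha = log_pi(y) - log_pi(x) + log_q_bwd - log_q_fwd
    # 1 - rng.random() lies in (0, 1], so the logarithm is finite.
    if np.log(1.0 - rng.random()) < log_alpha:
        return y, True
    return x, False
```

For a product of standard Gaussians (an assumption for illustration), one would pass `log_pi = lambda x: -0.5 * np.dot(x, x)` and `grad_log_pi = lambda x: -x`. The equilibrium analysis of Roberts and Rosenthal (1998) takes the proposal variance `sigma**2` proportional to $n^{-1/3}$; the abstract above indicates that no single scaling of this kind carries over to the transient phase.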

Article information

Source
Bernoulli, Volume 20, Number 4 (2014), 1930-1978.

Dates
First available in Project Euclid: 19 September 2014

Permanent link to this document
https://projecteuclid.org/euclid.bj/1411134449

Digital Object Identifier
doi:10.3150/13-BEJ546

Mathematical Reviews number (MathSciNet)
MR3263094

Zentralblatt MATH identifier
1329.60261

Keywords
diffusion limits; MALA; optimal scaling; propagation of chaos; random walk Metropolis

Citation

Jourdain, Benjamin; Lelièvre, Tony; Miasojedow, Błażej. Optimal scaling for the transient phase of Metropolis Hastings algorithms: The longtime behavior. Bernoulli 20 (2014), no. 4, 1930–1978. doi:10.3150/13-BEJ546. https://projecteuclid.org/euclid.bj/1411134449



References

  • [1] Andrieu, C. and Robert, C. (2001). Controlled MCMC for optimal sampling. Working Papers 2001-33, Centre de Recherche en Economie et Statistique. Available at http://ideas.repec.org/p/crs/wpaper/2001-33.html.
  • [2] Ané, C., Blachère, S., Chafaï, D., Fougères, P., Gentil, I., Malrieu, F., Roberto, C. and Scheffer, G. (2000). Sur les Inégalités de Sobolev Logarithmiques. Panoramas et Synthèses [Panoramas and Syntheses] 10. Paris: Société Mathématique de France. With a preface by Dominique Bakry and Michel Ledoux.
  • [3] Atchadé, Y.F. and Rosenthal, J.S. (2005). On adaptive Markov chain Monte Carlo algorithms. Bernoulli 11 815–828.
  • [4] Bédard, M. (2007). Weak convergence of Metropolis algorithms for non-i.i.d. target distributions. Ann. Appl. Probab. 17 1222–1244.
  • [5] Bédard, M. (2008). Optimal acceptance rates for Metropolis algorithms: Moving beyond 0.234. Stochastic Process. Appl. 118 2198–2222.
  • [6] Bédard, M., Douc, R. and Moulines, E. (2014). Scaling analysis of delayed rejection MCMC methods. Methodol. Comput. Appl. Probab. To appear. Published online: 6 March 2013.
  • [7] Bédard, M., Douc, R. and Moulines, E. (2012). Scaling analysis of multiple-try MCMC methods. Stochastic Process. Appl. 122 758–786.
  • [8] Beskos, A., Pillai, N., Roberts, G., Sanz-Serna, J.M. and Stuart, A. (2013). Optimal tuning of the hybrid Monte Carlo algorithm. Bernoulli 19 1501–1534.
  • [9] Beskos, A., Roberts, G. and Stuart, A. (2009). Optimal scalings for local Metropolis–Hastings chains on nonproduct targets in high dimensions. Ann. Appl. Probab. 19 863–898.
  • [10] Breyer, L.A., Piccioni, M. and Scarlatti, S. (2004). Optimal scaling of MaLa for nonlinear regression. Ann. Appl. Probab. 14 1479–1505.
  • [11] Breyer, L.A. and Roberts, G.O. (2000). From Metropolis to diffusions: Gibbs states and optimal scaling. Stochastic Process. Appl. 90 181–206.
  • [12] Christensen, O.F., Roberts, G.O. and Rosenthal, J.S. (2005). Scaling limits for the transient phase of local Metropolis–Hastings algorithms. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 253–268.
  • [13] Hastings, W.K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 97–109.
  • [14] Jourdain, B., Lelièvre, T. and Miasojedow, B. (2012). Optimal scaling for the transient phase of the random walk Metropolis algorithm: The mean-field limit. Preprint. Available at http://fr.arxiv.org/abs/1210.7639.
  • [15] Mattingly, J.C., Pillai, N.S. and Stuart, A.M. (2012). Diffusion limits of the random walk Metropolis algorithm in high dimensions. Ann. Appl. Probab. 22 881–930.
  • [16] Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A. and Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys. 21 1087–1092.
  • [17] Neal, P. and Roberts, G. (2011). Optimal scaling of random walk Metropolis algorithms with non-Gaussian proposals. Methodol. Comput. Appl. Probab. 13 583–601.
  • [18] Neal, P., Roberts, G. and Yuen, W.K. (2012). Optimal scaling of random walk Metropolis algorithms with discontinuous target densities. Ann. Appl. Probab. 22 1880–1927.
  • [19] Otto, F. and Villani, C. (2000). Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. J. Funct. Anal. 173 361–400.
  • [20] Pillai, N., Stuart, A. and Thiéry, A. (2011). Optimal proposal design for random walk type Metropolis algorithms with Gaussian random field priors. Preprint. Available at http://arxiv.org/abs/1108.1494.
  • [21] Pillai, N.S., Stuart, A.M. and Thiéry, A.H. (2012). Optimal scaling and diffusion limits for the Langevin algorithm in high dimensions. Ann. Appl. Probab. 22 2320–2356.
  • [22] Roberts, G.O., Gelman, A. and Gilks, W.R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110–120.
  • [23] Roberts, G.O. and Rosenthal, J.S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 255–268.
  • [24] Roberts, G.O. and Rosenthal, J.S. (2001). Optimal scaling for various Metropolis–Hastings algorithms. Statist. Sci. 16 351–367.
  • [25] Sznitman, A.S. (1991). Topics in propagation of chaos. In École D’Été de Probabilités de Saint-Flour XIX—1989. Lecture Notes in Math. 1464 165–251. Berlin: Springer.