Source: Ann. Appl. Probab. Volume 19, Number 2 (2009), 617-640.
We give conditions under which a Markov chain constructed via parallel or simulated tempering is guaranteed to be rapidly mixing. These conditions apply to a wide range of multimodal distributions arising in Bayesian statistical inference and statistical mechanics. We provide lower bounds on the spectral gaps of parallel and simulated tempering, and these bounds imply a single set of sufficient conditions for rapid mixing of both techniques. A direct consequence of our results is rapid mixing of parallel and simulated tempering for several normal mixture models and for the mean-field Ising model.