## Electronic Journal of Probability

### Sufficient Conditions for Torpid Mixing of Parallel and Simulated Tempering

#### Abstract

We obtain upper bounds on the spectral gap of Markov chains constructed by parallel and simulated tempering, and provide a set of sufficient conditions for torpid mixing of both techniques. Combined with the results of Woodard, Schmidler and Huber (2009), these results yield a two-sided bound on the spectral gap of these algorithms. We identify a persistence property of the target distribution, and show that it can lead unexpectedly to slow mixing that commonly used convergence diagnostics will fail to detect. For a multimodal distribution, the persistence is a measure of how "spiky", or tall and narrow, one peak is relative to the other peaks of the distribution. We show that this persistence phenomenon can be used to explain the torpid mixing of parallel and simulated tempering on the ferromagnetic mean-field Potts model shown previously. We also illustrate how it causes torpid mixing of tempering on a mixture of normal distributions with unequal covariances in $\mathbb{R}^M$, a previously unknown result with relevance to statistical inference problems. More generally, any time a multimodal distribution includes both very narrow and very wide peaks of comparable probability mass, parallel and simulated tempering are shown to mix slowly.
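The effect of a narrow peak under tempering can be illustrated with a back-of-the-envelope calculation. The sketch below is not the paper's definition of persistence; it assumes a mixture of two well-separated spherical normal components in $\mathbb{R}^M$, so that cross terms in the tempered density $\pi^\beta$ can be ignored and each component's tempered mass is proportional to $w^\beta \sigma^{M(1-\beta)}$. The function name and its parameters are hypothetical, chosen only for this illustration.

```python
import math


def narrow_peak_mass_fraction(beta, dim, sigma_narrow, sigma_wide, weight_narrow=0.5):
    """Approximate share of the tempered density pi^beta held by the narrow peak.

    Assumption (not from the paper): two well-separated spherical normal
    components, so (f1 + f2)^beta is treated as f1^beta + f2^beta.
    """
    # Log of each component's unnormalised tempered mass, w^beta * sigma^(dim*(1-beta));
    # factors shared by both components cancel in the ratio.
    log_narrow = beta * math.log(weight_narrow) + dim * (1.0 - beta) * math.log(sigma_narrow)
    log_wide = beta * math.log(1.0 - weight_narrow) + dim * (1.0 - beta) * math.log(sigma_wide)
    diff = log_wide - log_narrow
    if diff > 700.0:
        # exp(diff) would overflow; the narrow peak's share is numerically zero.
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    # Equal mixture weights, narrow peak ten times tighter than the wide one.
    for beta in (1.0, 0.75, 0.5, 0.25):
        share = narrow_peak_mass_fraction(beta, dim=10, sigma_narrow=0.1, sigma_wide=1.0)
        print(f"beta={beta:.2f}  narrow-peak share={share:.3e}")
```

Under these assumptions the narrow peak holds its full weight at $\beta = 1$ but a share that shrinks geometrically in $M$ at any lower $\beta$, which gives some intuition for why the tempered chains rarely visit a tall, narrow peak of comparable probability mass.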

#### Article information

Source
Electron. J. Probab., Volume 14 (2009), paper no. 29, 780-804.

Dates
Accepted: 31 March 2009
First available in Project Euclid: 1 June 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejp/1464819490

Digital Object Identifier
doi:10.1214/EJP.v14-638

Mathematical Reviews number (MathSciNet)
MR2495560

Zentralblatt MATH identifier
1189.65021

Subjects
Primary: 65C40: Computational Markov chains

#### Citation

Woodard, Dawn; Schmidler, Scott; Huber, Mark. Sufficient Conditions for Torpid Mixing of Parallel and Simulated Tempering. Electron. J. Probab. 14 (2009), paper no. 29, 780–804. doi:10.1214/EJP.v14-638. https://projecteuclid.org/euclid.ejp/1464819490

#### References

• Banerjee, Sudipto; Carlin, Brad P.; Gelfand, Alan E. Hierarchical Modeling and Analysis for Spatial Data, Chapman and Hall, Boca Raton, FL, 2004.
• Bhatnagar, Nayantara; Randall, Dana. Torpid mixing of simulated tempering on the Potts model. Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 478–487 (electronic), ACM, New York, 2004.
• Geman, S.; Geman, D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (1984), 721–741.
• Geyer, C. J. Markov chain Monte Carlo maximum likelihood. In Computing Science and Statistics, Volume 23: Proceedings of the 23rd Symposium on the Interface, E. Keramidas, Ed., 156–163, Interface Foundation of North America, Fairfax Station, VA, 1991.
• Geyer, C. J.; Thompson, E. A. Annealing Markov chain Monte Carlo with applications to ancestral inference. J. Amer. Statist. Assoc. 90 (1995), 909–920.
• Gore, Vivek K.; Jerrum, Mark R. The Swendsen-Wang process does not always mix rapidly. J. Statist. Phys. 97 (1999), no. 1-2, 67–86.
• Green, Peter J.; Richardson, Sylvia. Hidden Markov models and disease mapping. J. Amer. Statist. Assoc. 97 (2002), no. 460, 1055–1070.
• Kannan, Ravi; Li, Guangxing. Sampling according to the multivariate normal density. 37th Annual Symposium on Foundations of Computer Science (Burlington, VT, 1996), 204–212, IEEE Comput. Soc. Press, Los Alamitos, CA, 1996.
• Lawler, Gregory F.; Sokal, Alan D. Bounds on the $L^2$ spectrum for Markov chains and Markov processes: a generalization of Cheeger's inequality. Trans. Amer. Math. Soc. 309 (1988), no. 2, 557–580.
• Madras, Neal; Randall, Dana. Markov chain decomposition for convergence rate analysis. Ann. Appl. Probab. 12 (2002), no. 2, 581–606.
• Madras, Neal; Slade, Gordon. The self-avoiding walk. Probability and its Applications. Birkhäuser Boston, Inc., Boston, MA, 1993. xiv+425 pp. ISBN: 0-8176-3589-0
• Madras, Neal; Zheng, Zhongrong. On the swapping algorithm. Random Structures Algorithms 22 (2003), no. 1, 66–97.
• Marinari, E.; Parisi, G. Simulated tempering: a new Monte Carlo scheme. Europhysics Letters 19 (1992), 451–458.
• Matthews, Peter. A slowly mixing Markov chain with implications for Gibbs sampling. Statist. Probab. Lett. 17 (1993), no. 3, 231–236.
• Predescu, C.; Predescu, M.; Ciobanu, C. V. The incomplete beta function law for parallel tempering sampling of classical canonical systems. J. Chem. Phys. 120 (2004), 4119–4128.
• Roberts, Gareth O.; Rosenthal, Jeffrey S. General state space Markov chains and MCMC algorithms. Probab. Surv. 1 (2004), 20–71 (electronic).
• Roberts, G. O.; Tweedie, R. L. Rates of convergence of stochastically monotone and continuous time Markov models. J. Appl. Probab. 37 (2000), no. 2, 359–373.
• Schmidler, S. C.; Woodard, D. B. Computational complexity and Bayesian analysis. In preparation.
• Tierney, Luke. Markov chains for exploring posterior distributions. With discussion and a rejoinder by the author. Ann. Statist. 22 (1994), no. 4, 1701–1762.
• Woodard, D. B. Conditions for rapid and torpid mixing of parallel and simulated tempering on multimodal distributions. Ph.D. thesis, Duke University, 2007.
• Woodard, D. B. Detecting poor convergence of posterior samplers due to multimodality. Discussion Paper 2008-05, Duke University, Dept. of Statistical Science, 2008.
• Woodard, D. B.; Schmidler, S. C.; Huber, M. Conditions for rapid mixing of parallel and simulated tempering on multimodal distributions. In press, Annals of Applied Probability, 2009.
• Yuen, W. K. Application of geometric bounds to convergence rates of Markov chains and Markov processes on R^n, Ph.D. thesis, University of Toronto, 2001.
• Zheng, Zhongrong. On swapping and simulated tempering algorithms. Stochastic Process. Appl. 104 (2003), no. 1, 131–154.