Markov chain Monte Carlo is a key computational tool in Bayesian statistics, but it can be challenging to monitor the convergence of an iterative stochastic algorithm. In this paper we show that the convergence diagnostic $\widehat{R}$ of Gelman and Rubin (1992) has serious flaws. Traditional $\widehat{R}$ will fail to correctly diagnose convergence failures when the chain has a heavy tail or when the variance varies across the chains. In this paper we propose an alternative rank-based diagnostic that fixes these problems. We also introduce a collection of quantile-based local efficiency measures, along with a practical approach for computing Monte Carlo error estimates for quantiles. We suggest that common trace plots should be replaced with rank plots from multiple chains. Finally, we give recommendations for how these methods should be used in practice.
We thank Ben Bales, Ian Langmore, the editor, and anonymous reviewers for useful comments. We also thank Academy of Finland, the U.S. Office of Naval Research, National Science Foundation, Institute for Education Sciences, the Natural Science and Engineering Research Council of Canada, Finnish Center for Artificial Intelligence, and Technology Industries of Finland Centennial Foundation for partial support of this research. All computer code and an even larger variety of numerical experiments are available in the online appendix at https://avehtari.github.io/rhat_ess/rhat_ess.html.
A previous version of this manuscript contained a slight omission in the paragraph under equation (3.3) and one typo in equation (4.1). More specifically, under equation (3.3) “assuming the starting distribution of the simulations is appropriately overdispersed” has been changed to “assuming the starting distributions and all intermediate distributions of the simulations are appropriately overdispersed”; in equation (4.1), the denominator was initially written as “S - 1/4” and it has now been corrected to be “S + 1/4”. The article was corrected on 22 June 2021.
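As an illustration of where the corrected denominator "S + 1/4" enters, the following is a minimal sketch of rank normalization followed by a split-chain $\widehat{R}$ computed on the resulting normal scores. It is not the authors' code (that is in the online appendix). The numerator offset of 3/8, the function names, and the restriction to equal-length chains are assumptions made for this example, and the folding step named in the article title is not included.

```python
from statistics import NormalDist, mean, variance


def average_ranks(values):
    """Return 1-based ranks of `values`, averaging the ranks of tied entries."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_normalize(chains):
    """Pool all draws, rank them, and map the ranks to normal scores.

    Uses the fractional offset (r - 3/8) / (S + 1/4), where S is the total
    number of draws. The 3/8 numerator offset is an assumption of this sketch.
    """
    pooled = [x for chain in chains for x in chain]
    S = len(pooled)
    inv_cdf = NormalDist().inv_cdf
    z = [inv_cdf((r - 3 / 8) / (S + 1 / 4)) for r in average_ranks(pooled)]
    # Restore the original chain structure.
    out, pos = [], 0
    for chain in chains:
        out.append(z[pos:pos + len(chain)])
        pos += len(chain)
    return out


def split_rhat(chains):
    """Split each chain in half and compare between- and within-half variance.

    Assumes all chains have the same length (at least 4 draws each).
    """
    halves = []
    for chain in chains:
        n = len(chain) // 2
        halves.append(chain[:n])
        halves.append(chain[-n:])
    n = len(halves[0])
    W = mean([variance(h) for h in halves])       # within-chain variance
    B = n * variance([mean(h) for h in halves])   # between-chain variance
    var_plus = (n - 1) / n * W + B / n
    return (var_plus / W) ** 0.5


if __name__ == "__main__":
    import random

    random.seed(1)
    chains = [[random.gauss(0, 1) for _ in range(500)] for _ in range(4)]
    print(split_rhat(rank_normalize(chains)))
```

Because the normal scores depend only on the ranks of the draws, a few extreme values in a heavy-tailed chain cannot dominate the variance estimates in this sketch.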
Aki Vehtari, Andrew Gelman, Daniel Simpson, Bob Carpenter, Paul-Christian Bürkner. "Rank-Normalization, Folding, and Localization: An Improved $\widehat{R}$ for Assessing Convergence of MCMC (with Discussion)." Bayesian Anal. 16 (2) 667–718, June 2021. https://doi.org/10.1214/20-BA1221