The Annals of Statistics

Convergence rate of Markov chain methods for genomic motif discovery

Dawn B. Woodard and Jeffrey S. Rosenthal

Full-text: Open access


We analyze the convergence rate of a simplified version of a popular Gibbs sampling method used for statistical discovery of gene regulatory binding motifs in DNA sequences. This sampler satisfies a very strong form of ergodicity (uniform). However, we show that, due to multimodality of the posterior distribution, the rate of convergence often decreases exponentially as a function of the length of the DNA sequence. Specifically, we show that this occurs whenever there is more than one true repeating pattern in the data. In practice there are typically multiple such patterns in biological data, the goal being to detect the most well-conserved and frequently-occurring of these. Our findings match empirical results, in which the motif-discovery Gibbs sampler has exhibited such poor convergence that it is used only for finding modes of the posterior distribution (candidate motifs) rather than for obtaining samples from that distribution. Ours are some of the first meaningful bounds on the convergence rate of a Markov chain method for sampling from a multimodal posterior distribution, as a function of statistical quantities like the number of observations.

Article information

Ann. Statist., Volume 41, Number 1 (2013), 91-124.

First available in Project Euclid: 5 March 2013

Permanent link to this document

Digital Object Identifier

Mathematical Reviews number (MathSciNet)

Zentralblatt MATH identifier

Primary: 62F15: Bayesian inference
Secondary: 60J10: Markov chains (discrete-time Markov processes on discrete state spaces)

Gibbs sampler DNA slow mixing spectral gap multimodal


Woodard, Dawn B.; Rosenthal, Jeffrey S. Convergence rate of Markov chain methods for genomic motif discovery. Ann. Statist. 41 (2013), no. 1, 91--124. doi:10.1214/12-AOS1075.

Export citation


  • Andrieu, C., Doucet, A. and Holenstein, R. (2010). Particle Markov chain Monte Carlo methods (with discussion). J. R. Stat. Soc. Ser. B Stat. Methodol. 72 269–342.
  • Belloni, A. and Chernozhukov, V. (2009). On the computational complexity of MCMC-based estimators in large samples. Ann. Statist. 37 2011–2055.
  • Berk, R. H. (1966). Limiting behavior of posterior distributions when the model is incorrect. Ann. Math. Statist. 37 51–58.
  • Bhatnagar, N. and Randall, D. (2004). Torpid mixing of simulated tempering on the Potts model. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms 478–487. ACM, New York.
  • Borgs, C., Chayes, J. T., Frieze, A., Kim, J. H., Tetali, P., Vigoda, E. and Vu, V. H. (1999). Torpid mixing of some MCMC algorithms in statistical physics. In Proceedings of the 40th IEEE Symposium on Foundations of Computer Science 218–229. IEEE, New York.
  • Del Moral, P., Doucet, A. and Jasra, A. (2006). Sequential Monte Carlo samplers. J. R. Stat. Soc. Ser. B Stat. Methodol. 68 411–436.
  • Diaconis, P. and Saloff-Coste, L. (1993). Comparison theorems for reversible Markov chains. Ann. Appl. Probab. 3 696–730.
  • Diaconis, P. and Saloff-Coste, L. (1996). Logarithmic Sobolev inequalities for finite Markov chains. Ann. Appl. Probab. 6 695–750.
  • Diaconis, P. and Stroock, D. (1991). Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Probab. 1 36–61.
  • Fort, G., Moulines, E., Roberts, G. O. and Rosenthal, J. S. (2003). On the geometric ergodicity of hybrid samplers. J. Appl. Probab. 40 123–146.
  • Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statist. Sci. 7 457–472.
  • Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern. Anal. Mach. Intell. 6 721–741.
  • Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In Bayesian Statistics, 4 (PeñíScola, 1991) (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 169–193. Oxford Univ. Press, New York.
  • Green, P. J. and Richardson, S. (2002). Hidden Markov models and disease mapping. J. Amer. Statist. Assoc. 97 1055–1070.
  • Hans, C., Dobra, A. and West, M. (2007). Shotgun stochastic search for “large $p$” regression. J. Amer. Statist. Assoc. 102 507–516.
  • Jarner, S. F. and Hansen, E. (2000). Geometric ergodicity of Metropolis algorithms. Stochastic Process. Appl. 85 341–361.
  • Jensen, S. T., Liu, X. S., Zhou, Q. and Liu, J. S. (2004). Computational discovery of gene regulatory binding motifs: A Bayesian perspective. Statist. Sci. 19 188–204.
  • Johnson, A. A. and Jones, G. L. (2010). Gibbs sampling for a Bayesian hierarchical general linear model. Electron. J. Stat. 4 313–333.
  • Jones, G. L. and Hobert, J. P. (2001). Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Statist. Sci. 16 312–334.
  • Jones, G. L. and Hobert, J. P. (2004). Sufficient burn-in for Gibbs samplers for a hierarchical random effects model. Ann. Statist. 32 784–817.
  • Kamatani, K. (2011). Weak consistency of Markov chain Monte Carlo methods. Technical report. Available at
  • Kellis, M., Patterson, N., Birren, B., Berger, B. and Lander, E. S. (2004). Methods in comparative genomics: Genome correspondence, gene identification and regulatory motif discovery. J. Comput. Biol. 11 319–355.
  • Kullback, S. (1959). Information Theory and Statistics. Wiley, New York.
  • Lawrence, C. E., Altschul, S. F., Boguski, M. S., Liu, J. S., Neuwald, A. F. and Wootton, J. C. (1993). Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment. Science 262 208–214.
  • Liang, F. and Wong, W. H. (2000). Evolutionary Monte Carlo: Applications to $C_p$ model sampling and change point problem. Statist. Sinica 10 317–342.
  • Liu, J. S. (1994). The collapsed Gibbs sampler in Bayesian computations with applications to a gene regulation problem. J. Amer. Statist. Assoc. 89 958–966.
  • Liu, X., Brutlag, D. L. and Liu, J. S. (2001). BioProspector: Discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. Pacific Symposium on Biocomputing 6 127–138.
  • Liu, J. S., Neuwald, A. F. and Lawrence, C. E. (1995). Bayesian models for multiple local sequence alignment and Gibbs sampling strategies. J. Amer. Statist. Assoc. 90 1156–1170.
  • Liu, J. S., Wong, W. H. and Kong, A. (1995). Covariance structure and convergence rate of the Gibbs sampler with various scans. J. Roy. Statist. Soc. Ser. B 57 157–169.
  • Madras, N. and Randall, D. (2002). Markov chain decomposition for convergence rate analysis. Ann. Appl. Probab. 12 581–606.
  • Madras, N. and Zheng, Z. (2003). On the swapping algorithm. Random Structures Algorithms 22 66–97.
  • Mira, A. (2001). Ordering and improving the performance of Monte Carlo Markov chains. Statist. Sci. 16 340–350.
  • Mossel, E. and Vigoda, E. (2006). Limitations of Markov chain Monte Carlo algorithms for Bayesian inference of phylogeny. Ann. Appl. Probab. 16 2215–2234.
  • Neuwald, A. F., Liu, J. S. and Lawrence, C. E. (1995). Gibbs motif sampling: Detection of bacterial outer membrane protein repeats. Protein Sci. 4 1618–1632.
  • Peskun, P. H. (1973). Optimum Monte-Carlo sampling using Markov chains. Biometrika 60 607–612.
  • Roberts, G. O. and Rosenthal, J. S. (2004). General state space Markov chains and MCMC algorithms. Probab. Surv. 1 20–71.
  • Roberts, G. O. and Sahu, S. K. (2001). Approximate predetermined convergence properties of the Gibbs sampler. J. Comput. Graph. Statist. 10 216–229.
  • Rosenthal, J. S. (1995). Minorization conditions and convergence rates for Markov chain Monte Carlo. J. Amer. Statist. Assoc. 90 558–566.
  • Rosenthal, J. S. (1996). Analysis of the Gibbs sampler for a model related to James–Stein estimators. Statist. Comput. 6 269–275.
  • Roth, F. P., Hughes, J. D., Estep, P. W. and Church, G. M. (1998). Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation. Nat. Biotechnol. 16 939–945.
  • Sinclair, A. (1992). Improved bounds for mixing rates of Markov chains and multicommodity flow. Combin. Probab. Comput. 1 351–370.
  • Tierney, L. (1998). A note on Metropolis–Hastings kernels for general state spaces. Ann. Appl. Probab. 8 1–9.
  • Woodard, D. B. and Rosenthal, J. S. (2013). Supplement to “Convergence rate of Markov chain methods for genomic motif discovery.” DOI:10.1214/12-AOS1075SUPP.
  • Woodard, D. B., Schmidler, S. C. and Huber, M. (2009a). Conditions for rapid mixing of parallel and simulated tempering on multimodal distributions. Ann. Appl. Probab. 19 617–640.
  • Woodard, D. B., Schmidler, S. C. and Huber, M. (2009b). Sufficient conditions for torpid mixing of parallel and simulated tempering. Electron. J. Probab. 14 780–804.

Supplemental materials