The Annals of Statistics

Convergence complexity analysis of Albert and Chib’s algorithm for Bayesian probit regression

Qian Qin and James P. Hobert


Abstract

The use of MCMC algorithms in high dimensional Bayesian problems has become routine. This has spurred so-called convergence complexity analysis, the goal of which is to ascertain how the convergence rate of a Monte Carlo Markov chain scales with the sample size, $n$, and/or the number of covariates, $p$. This article provides a thorough convergence complexity analysis of Albert and Chib’s [J. Amer. Statist. Assoc. 88 (1993) 669–679] data augmentation algorithm for the Bayesian probit regression model. The main tools used in this analysis are drift and minorization conditions. The usual pitfalls associated with this type of analysis are avoided by utilizing centered drift functions, which are minimized in high posterior probability regions, and by using a new technique to suppress high-dimensionality in the construction of minorization conditions. The main result is that the geometric convergence rate of the underlying Markov chain is bounded below 1 both as $n\rightarrow\infty$ (with $p$ fixed), and as $p\rightarrow\infty$ (with $n$ fixed). Furthermore, the first computable bounds on the total variation distance to stationarity are byproducts of the asymptotic analysis.
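For context, the Markov chain analyzed in the paper is Albert and Chib's (1993) data augmentation (DA) Gibbs sampler for probit regression. The following is a minimal sketch of that sampler under an improper flat prior on the coefficient vector (the paper's proper-prior setting changes only the coefficient update); the function name, the synthetic data, and the flat-prior choice are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
from scipy.stats import truncnorm

def albert_chib_sampler(X, y, n_iter=500, rng=None):
    """Sketch of the Albert-Chib DA Gibbs chain for probit regression.

    Model: y_i = 1{z_i > 0}, z_i ~ N(x_i' beta, 1), flat prior on beta
    (assumed here for simplicity). Returns an (n_iter, p) array of draws.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)          # posterior covariance of beta | z
    chol = np.linalg.cholesky(XtX_inv)
    beta = np.zeros(p)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        # I-step: draw latent z_i | beta, y_i from N(x_i' beta, 1)
        # truncated to (0, inf) if y_i = 1 and to (-inf, 0] if y_i = 0.
        mean = X @ beta
        lo = np.where(y == 1, -mean, -np.inf)  # standardized lower bounds
        hi = np.where(y == 1, np.inf, -mean)   # standardized upper bounds
        z = mean + truncnorm.rvs(lo, hi, random_state=rng)
        # P-step: draw beta | z ~ N((X'X)^{-1} X'z, (X'X)^{-1}).
        beta = XtX_inv @ (X.T @ z) + chol @ rng.standard_normal(p)
        draws[t] = beta
    return draws

# Illustrative run on synthetic data.
data_rng = np.random.default_rng(0)
n, p = 100, 3
X = data_rng.standard_normal((n, p))
true_beta = np.array([1.0, -0.5, 0.0])
y = (X @ true_beta + data_rng.standard_normal(n) > 0).astype(int)
draws = albert_chib_sampler(X, y, n_iter=500, rng=1)
```

The paper's convergence complexity question concerns how quickly the `beta` marginal of this chain approaches the posterior as `n` or `p` grows; the code above only exhibits the transition dynamics being analyzed.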

Article information

Source
Ann. Statist., Volume 47, Number 4 (2019), 2320-2347.

Dates
Received: December 2017
Revised: April 2018
First available in Project Euclid: 21 May 2019

Permanent link to this document
https://projecteuclid.org/euclid.aos/1558425647

Digital Object Identifier
doi:10.1214/18-AOS1749

Mathematical Reviews number (MathSciNet)
MR3953453

Zentralblatt MATH identifier
07082288

Subjects
Primary: 60J05: Discrete-time Markov processes on general state spaces
Secondary: 65C05: Monte Carlo methods

Keywords
Drift condition; geometric ergodicity; high dimensional inference; large $p$-small $n$; Markov chain Monte Carlo; minorization condition

Citation

Qin, Qian; Hobert, James P. Convergence complexity analysis of Albert and Chib’s algorithm for Bayesian probit regression. Ann. Statist. 47 (2019), no. 4, 2320--2347. doi:10.1214/18-AOS1749. https://projecteuclid.org/euclid.aos/1558425647


References

  • Albert, J. H. and Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. J. Amer. Statist. Assoc. 88 669–679.
  • Baragatti, M. and Pommeret, D. (2012). A study of variable selection using $g$-prior distribution with ridge parameter. Comput. Statist. Data Anal. 56 1920–1934.
  • Baxendale, P. H. (2005). Renewal theory and computable convergence rates for geometrically ergodic Markov chains. Ann. Appl. Probab. 15 700–738.
  • Chakraborty, S. and Khare, K. (2017). Convergence properties of Gibbs samplers for Bayesian probit regression with proper priors. Electron. J. Stat. 11 177–210.
  • Chen, M.-H. and Shao, Q.-M. (2001). Propriety of posterior distribution for dichotomous quantal response models. Proc. Amer. Math. Soc. 129 293–302.
  • Diaconis, P., Khare, K. and Saloff-Coste, L. (2008). Gibbs sampling, exponential families and orthogonal polynomials (with discussion). Statist. Sci. 23 151–178.
  • Durmus, A. and Moulines, E. (2016). High-dimensional Bayesian inference via the unadjusted Langevin algorithm. arXiv:1605.01559.
  • Flegal, J. M., Haran, M. and Jones, G. L. (2008). Markov chain Monte Carlo: Can we trust the third significant figure? Statist. Sci. 23 250–260.
  • Fort, G., Moulines, E., Roberts, G. O. and Rosenthal, J. S. (2003). On the geometric ergodicity of hybrid samplers. J. Appl. Probab. 40 123–146.
  • Gupta, M. and Ibrahim, J. G. (2007). Variable selection in regression mixture modeling for the discovery of gene regulatory networks. J. Amer. Statist. Assoc. 102 867–880.
  • Hairer, M. and Mattingly, J. C. (2011). Yet another look at Harris’ ergodic theorem for Markov chains. In Seminar on Stochastic Analysis, Random Fields and Applications VI. Progress in Probability 63 109–117. Birkhäuser/Springer Basel AG, Basel.
  • Johndrow, J. E., Smith, A., Pillai, N. and Dunson, D. B. (2018). MCMC for imbalanced categorical data. J. Amer. Statist. Assoc. To appear. Available at arXiv:1605.05798.
  • Jones, G. L. (2001). Convergence Rates and Monte Carlo Standard Errors for Markov Chain Monte Carlo Algorithms. ProQuest LLC, Ann Arbor, MI. Ph.D. thesis, Univ. Florida.
  • Jones, G. L. and Hobert, J. P. (2001). Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Statist. Sci. 16 312–334.
  • Łatuszyński, K., Miasojedow, B. and Niemiro, W. (2013). Nonasymptotic bounds on the estimation error of MCMC algorithms. Bernoulli 19 2033–2066.
  • Marchev, D. and Hobert, J. P. (2004). Geometric ergodicity of van Dyk and Meng’s algorithm for the multivariate Student’s $t$ model. J. Amer. Statist. Assoc. 99 228–238.
  • Meyn, S. P. and Tweedie, R. L. (1994). Computable bounds for geometric convergence rates of Markov chains. Ann. Appl. Probab. 4 981–1011.
  • Meyn, S. and Tweedie, R. L. (2009). Markov Chains and Stochastic Stability, 2nd ed. Cambridge Univ. Press, Cambridge.
  • Qin, Q. and Hobert, J. P. (2018). Supplement to “Convergence complexity analysis of Albert and Chib’s algorithm for Bayesian probit regression.” DOI:10.1214/18-AOS1749SUPP.
  • Rajaratnam, B. and Sparks, D. (2015). MCMC-based inference in the era of big data: A fundamental analysis of the convergence complexity of high-dimensional chains. arXiv:1508.00947.
  • Roberts, G. O. and Rosenthal, J. S. (1997). Geometric ergodicity and hybrid Markov chains. Electron. Commun. Probab. 2 13–25.
  • Roberts, G. O. and Rosenthal, J. S. (1998). Markov-chain Monte Carlo: Some practical implications of theoretical results (with discussion). Canad. J. Statist. 26 5–31.
  • Roberts, G. O. and Rosenthal, J. S. (2001). Markov chains and de-initializing processes. Scand. J. Stat. 28 489–504.
  • Roberts, G. O. and Tweedie, R. L. (1999). Bounds on regeneration times and convergence rates for Markov chains. Stochastic Process. Appl. 80 211–229.
  • Roberts, G. O. and Tweedie, R. L. (2001). Geometric $L^{2}$ and $L^{1}$ convergence are equivalent for reversible Markov chains. J. Appl. Probab. 38A 37–41.
  • Rosenthal, J. S. (1995). Minorization conditions and convergence rates for Markov chain Monte Carlo. J. Amer. Statist. Assoc. 90 558–566.
  • Roy, V. and Hobert, J. P. (2007). Convergence rates and asymptotic standard errors for Markov chain Monte Carlo algorithms for Bayesian probit regression. J. R. Stat. Soc. Ser. B. Stat. Methodol. 69 607–623.
  • Roy, V. and Hobert, J. P. (2010). On Monte Carlo methods for Bayesian multivariate regression models with heavy-tailed errors. J. Multivariate Anal. 101 1190–1202.
  • Rudin, W. (1976). Principles of Mathematical Analysis, 3rd ed. McGraw-Hill Book Co., New York-Auckland-Düsseldorf.
  • Sinclair, A. and Jerrum, M. (1989). Approximate counting, uniform generation and rapidly mixing Markov chains. Inform. and Comput. 82 93–133.
  • Vats, D. (2017). Geometric ergodicity of Gibbs samplers in Bayesian penalized regression models. Electron. J. Stat. 11 4033–4064.
  • Yang, J. and Rosenthal, J. S. (2017). Complexity results for MCMC derived from quantitative bounds. arXiv:1708.00829.
  • Yang, A.-J. and Song, X.-Y. (2009). Bayesian variable selection for disease classification using gene expression data. Bioinformatics 26 215–222.
  • Yang, Y., Wainwright, M. J. and Jordan, M. I. (2016). On the computational complexity of high-dimensional Bayesian variable selection. Ann. Statist. 44 2497–2532.

Supplemental materials

  • Supplementary material for “Convergence complexity analysis of Albert and Chib’s algorithm for Bayesian probit regression”. Section 6 provides some basic results on Hermitian matrices and truncated normal distributions. Section 7 gives some technical results, and the proofs for Corollary 5, Proposition 13, Proposition 16, Proposition 20 and Proposition 23.