Bayesian Analysis

Automated Parameter Blocking for Efficient Markov Chain Monte Carlo Sampling

Daniel Turek, Perry de Valpine, Christopher J. Paciorek, and Clifford Anderson-Bergman

Full-text: Open access

Abstract

Markov chain Monte Carlo (MCMC) sampling is an important and commonly used tool for the analysis of hierarchical models. Nevertheless, practitioners generally have two options for MCMC: use existing software that generates a black-box “one size fits all” algorithm, or undertake the challenging (and time-consuming) task of implementing a problem-specific MCMC algorithm. Either choice may result in inefficient sampling, and hence researchers have become accustomed to MCMC runtimes on the order of days (or longer) for large models. We propose an automated procedure to determine an efficient MCMC block-sampling algorithm for a given model and computing platform. Our procedure dynamically determines blocks of parameters for joint sampling that result in efficient MCMC sampling of the entire model. We test this procedure using a diverse suite of example models and observe non-trivial improvements in MCMC efficiency for many of them. Our procedure is the first of its kind and may be generalized to a broader space of MCMC algorithms. Our results suggest that substantive improvements in MCMC efficiency can be practically realized using our automated blocking procedure, or variants thereof, which warrants additional study and application.
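The blocking idea in the abstract, grouping correlated parameters for joint sampling, can be sketched in a few lines. This is an illustrative toy, not the paper's exact algorithm (which builds blocks via hierarchical clustering with an adaptively chosen cut point); here, as a simplifying assumption, parameters from a pilot MCMC run are greedily merged into a candidate block whenever their empirical posterior correlation exceeds a fixed threshold. The function name `propose_blocks` and the threshold value are hypothetical.

```python
import numpy as np

def propose_blocks(samples, threshold=0.5):
    """Group parameters whose empirical posterior correlation exceeds
    `threshold` into candidate sampling blocks (greedy union-find).

    samples: (n_iterations, n_params) array from a pilot MCMC run.
    Returns a list of blocks, each a sorted list of parameter indices.
    """
    # Absolute empirical correlation between parameter traces.
    corr = np.abs(np.corrcoef(samples, rowvar=False))
    n = corr.shape[1]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge any pair of parameters that are strongly correlated.
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] > threshold:
                parent[find(i)] = find(j)

    blocks = {}
    for i in range(n):
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())

# Toy pilot run: parameters 0 and 1 strongly correlated, parameter 2 independent.
rng = np.random.default_rng(0)
x = rng.normal(size=(5000, 3))
x[:, 1] = x[:, 0] + 0.1 * rng.normal(size=5000)
print(propose_blocks(x))  # [[0, 1], [2]]
```

Each resulting block would then be assigned a joint (e.g. adaptive block Metropolis) sampler, with per-parameter samplers kept for singleton blocks; the paper's procedure additionally evaluates candidate blockings empirically on the target computing platform.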

Article information

Source
Bayesian Anal., Volume 12, Number 2 (2017), 465-490.

Dates
First available in Project Euclid: 26 May 2016

Permanent link to this document
https://projecteuclid.org/euclid.ba/1464266500

Digital Object Identifier
doi:10.1214/16-BA1008

Mathematical Reviews number (MathSciNet)
MR3620741

Zentralblatt MATH identifier
1384.62022

Keywords
MCMC; Metropolis–Hastings; block sampling; integrated autocorrelation time; mixing; NIMBLE

Rights
Creative Commons Attribution 4.0 International License.

Citation

Turek, Daniel; de Valpine, Perry; Paciorek, Christopher J.; Anderson-Bergman, Clifford. Automated Parameter Blocking for Efficient Markov Chain Monte Carlo Sampling. Bayesian Anal. 12 (2017), no. 2, 465–490. doi:10.1214/16-BA1008. https://projecteuclid.org/euclid.ba/1464266500



Supplemental materials