Brazilian Journal of Probability and Statistics
- Braz. J. Probab. Stat.
- Volume 31, Number 4 (2017), 668-685.
Comparing consensus Monte Carlo strategies for distributed Bayesian computation
Consensus Monte Carlo is an algorithm for conducting Monte Carlo based Bayesian inference on large data sets distributed across many worker machines in a data center. The algorithm operates by running a separate Monte Carlo algorithm on each worker machine, which only sees a portion of the full data set. The worker-level posterior samples are then combined to form a Monte Carlo approximation to the full posterior distribution based on the complete data set. We compare several methods of carrying out the combination, including a new method based on approximating worker-level simulations using a mixture of multivariate Gaussian distributions. We find that resampling and kernel density based methods break down after 10 or sometimes fewer dimensions, while the new mixture-based approach works well, but the necessary mixture models take too long to fit.
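The divide-and-combine idea described in the abstract can be sketched in a few lines. The abstract does not spell out any single combination rule, so the sketch below assumes one simple choice, precision-weighted averaging of worker draws, purely for illustration; it is not presented as one of the methods the paper recommends. The function name `consensus_average`, the toy Gaussian model with a flat prior, and the direct sampling that stands in for a worker-level Monte Carlo run are all assumptions made for this example.

```python
import numpy as np


def consensus_average(worker_draws):
    """Combine worker-level draws by precision-weighted averaging.

    worker_draws: list of arrays of shape (n_draws, dim), one per worker,
    all with the same n_draws. Returns an array of shape (n_draws, dim).
    """
    # Weight each worker by the inverse of its sample covariance matrix.
    weights = [
        np.linalg.inv(np.atleast_2d(np.cov(d, rowvar=False)))
        for d in worker_draws
    ]
    total_inv = np.linalg.inv(sum(weights))
    # g-th combined draw = (sum_s W_s)^{-1} * sum_s W_s * theta_{s,g}
    weighted_sum = sum(d @ w.T for d, w in zip(worker_draws, weights))
    return weighted_sum @ total_inv.T


# Toy setting (assumed): 2-D Gaussian mean, identity covariance, flat prior.
rng = np.random.default_rng(0)
data = rng.normal(loc=[1.0, -2.0], scale=1.0, size=(4000, 2))

# Each "worker" sees only one shard of the data.
shards = np.array_split(data, 4)

# Stand-in for a worker-level Monte Carlo run: under the assumed model the
# shard posterior is N(shard mean, I / n_shard), so we sample it directly.
worker_draws = [
    rng.normal(loc=s.mean(axis=0), scale=1.0 / np.sqrt(len(s)), size=(5000, 2))
    for s in shards
]

combined = consensus_average(worker_draws)
print("combined posterior mean:", combined.mean(axis=0))
print("full-data sample mean:  ", data.mean(axis=0))
```

In this Gaussian toy case the averaged draws should closely match the full-data posterior, because every worker posterior is itself Gaussian. The abstract's comparison concerns what to do when that is not the case, which this sketch does not address.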
Received: December 2016
Accepted: April 2017
First available in Project Euclid: 15 December 2017
Scott, Steven L. Comparing consensus Monte Carlo strategies for distributed Bayesian computation. Braz. J. Probab. Stat. 31 (2017), no. 4, 668-685. doi:10.1214/17-BJPS365. https://projecteuclid.org/euclid.bjps/1513328760
- Comment: Consensus Monte Carlo using expectation propagation.
- Comment: A brief survey of the current state of play for Bayesian computation in data science at big-data scale.