Open Access
June 2023
Distributed Computation for Marginal Likelihood based Model Choice
Alexander Buchholz, Daniel Ahfock, Sylvia Richardson
Bayesian Anal. 18(2): 607-638 (June 2023). DOI: 10.1214/22-BA1321

Abstract

We propose a general method for distributed Bayesian model choice based on the marginal likelihood, in which a data set is split into non-overlapping subsets. These subsets are accessed only locally by individual workers, and no data is shared between the workers. We approximate the model evidence for the full data set through Monte Carlo sampling from the posterior on every subset, generating a model evidence per subset. The results are combined using a novel approach that corrects for the splitting using summary statistics of the generated samples. Our divide-and-conquer approach enables Bayesian model choice in the large data setting, exploiting all available information while limiting communication between workers. We derive theoretical error bounds that quantify the resulting trade-off between computational gain and loss in precision. The embarrassingly parallel nature of the method yields substantial speed-ups on massive data sets, as illustrated by our real-world experiments. In addition, we show how the suggested approach can be extended to model choice within a reversible jump setting that explores multiple feature combinations within one run.
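
To make the idea of combining per-subset evidences with a correction term concrete, the following is a minimal sketch on a toy conjugate model. It is not the authors' implementation. Everything here is an assumption chosen for illustration: the Gaussian mean model with known variance, the choice to give each worker the prior raised to the power 1/S, and the function names `worker` and `combine`. Under those assumptions the full-data evidence factorises as alpha^S times the product of the subset evidences times the integral of the product of the subset posteriors, where alpha normalises the fractionated prior. In this toy case every quantity is available in closed form, so the sketch uses exact moments where a practical method would use Monte Carlo estimates.

```python
import numpy as np

def log_evidence(y, sigma2, prior_var):
    """Exact log evidence for y_i ~ N(theta, sigma2), theta ~ N(0, prior_var)."""
    n, s1, s2 = len(y), np.sum(y), np.sum(y ** 2)
    logdet = (n - 1) * np.log(sigma2) + np.log(sigma2 + n * prior_var)
    quad = (s2 - prior_var * s1 ** 2 / (sigma2 + n * prior_var)) / sigma2
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

def posterior_moments(y, sigma2, prior_var):
    """Mean and variance of the Gaussian posterior of theta."""
    var = 1.0 / (len(y) / sigma2 + 1.0 / prior_var)
    return var * np.sum(y) / sigma2, var

def log_integral_of_gaussian_product(means, variances):
    """log of the integral over theta of prod_s N(theta; m_s, v_s)."""
    means, variances = np.asarray(means), np.asarray(variances)
    pooled_var = 1.0 / np.sum(1.0 / variances)
    pooled_mean = pooled_var * np.sum(means / variances)
    quad = np.sum(means ** 2 / variances) - pooled_mean ** 2 / pooled_var
    return (-0.5 * (len(means) - 1) * np.log(2 * np.pi)
            - 0.5 * np.sum(np.log(variances))
            + 0.5 * np.log(pooled_var) - 0.5 * quad)

def worker(y_shard, sigma2, tau2, n_shards):
    """Runs locally on one shard; returns only three summary numbers.
    Assumption: the worker uses the fractionated prior N(0, n_shards * tau2),
    which is the N(0, tau2) prior raised to the power 1/n_shards, renormalised."""
    frac_var = n_shards * tau2
    mean, var = posterior_moments(y_shard, sigma2, frac_var)
    return log_evidence(y_shard, sigma2, frac_var), mean, var

def combine(summaries, tau2):
    """Combine shard summaries into the full-data log evidence."""
    S = len(summaries)
    log_ev, means, variances = zip(*summaries)
    # log alpha, where alpha is the integral of the prior density to the power 1/S
    log_alpha = 0.5 * np.log(2 * np.pi * S * tau2) - 0.5 * np.log(2 * np.pi * tau2) / S
    return S * log_alpha + sum(log_ev) + log_integral_of_gaussian_product(means, variances)

rng = np.random.default_rng(0)
sigma2, tau2, n_shards = 1.0, 4.0, 6
y = rng.normal(0.7, np.sqrt(sigma2), size=1200)
shards = np.array_split(y, n_shards)            # non-overlapping subsets
summaries = [worker(s, sigma2, tau2, n_shards) for s in shards]
full = log_evidence(y, sigma2, tau2)            # reference: all data in one place
combined = combine(summaries, tau2)
print(full, combined)
```

In this conjugate setting the two printed values agree up to floating-point error, because the subset posteriors are exactly Gaussian. For non-Gaussian posteriors the integral of the product would have to be approximated from sample summaries, which is where a loss in precision of the kind the abstract mentions would enter.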

Funding Statement

This work was supported by the EPSRC (EP/R018561/1), an MRC programme grant (MC_UU_00002/10) and the Alan Turing Institute (TU/B/00092).

Acknowledgments

We thank Leonardo Bottolo, Paul Newcombe, Will Astle and Nicolas Chopin for helpful discussions and Will Astle for providing data. We would like to thank the reviewers and the editor for their feedback that greatly helped to improve our work.

Citation

Alexander Buchholz, Daniel Ahfock, Sylvia Richardson. "Distributed Computation for Marginal Likelihood based Model Choice." Bayesian Anal. 18 (2) 607-638, June 2023. https://doi.org/10.1214/22-BA1321

Information

Published: June 2023
First available in Project Euclid: 18 July 2022

MathSciNet: MR4578066
Digital Object Identifier: 10.1214/22-BA1321

Subjects:
Primary: 62F15
Secondary: 68W15

Keywords: Distributed computation, marginal likelihood
