Open Access
June 2019
The Zig-Zag process and super-efficient sampling for Bayesian analysis of big data
Joris Bierkens, Paul Fearnhead, Gareth Roberts
Ann. Statist. 47(3): 1288-1320 (June 2019). DOI: 10.1214/18-AOS1715


Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multidimensional version of the Zig-Zag process of [Ann. Appl. Probab. 27 (2017) 846–882], a continuous-time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence), the Zig-Zag process offers a flexible nonreversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, that is, the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial preprocessing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.
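To make the phrase "simulated without discretisation error" concrete, the following is a minimal sketch of a one-dimensional Zig-Zag process. It is not taken from the paper: it assumes a standard Gaussian target with potential U(x) = x^2/2 and the canonical switching rate max(0, theta * U'(x)), for which the next switching time can be drawn exactly by inverting the integrated rate. The function names `zigzag_gaussian_1d` and `time_average_moments` are hypothetical, and the sub-sampling and control-variate variants described in the abstract are not covered.

```python
import math
import random


def zigzag_gaussian_1d(total_time, x0=0.0, theta0=1.0, seed=0):
    """Simulate a 1D Zig-Zag process for an assumed target N(0, 1).

    Assumptions (illustrative, not from the paper's text): potential
    U(x) = x^2 / 2 and switching rate max(0, theta * x). Returns the
    skeleton: (time, position, velocity after the event) at each switch.
    """
    rng = random.Random(seed)
    x, theta, t = x0, theta0, 0.0
    skeleton = [(t, x, theta)]
    while t < total_time:
        a = theta * x  # the rate along the path at time s is max(0, a + s)
        e = rng.expovariate(1.0)
        # Solve  integral_0^tau max(0, a + s) ds = e  in closed form.
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        t += tau
        x += theta * tau  # deterministic linear motion between events
        theta = -theta    # velocity flips at the event
        skeleton.append((t, x, theta))
    return skeleton


def time_average_moments(skeleton):
    """Exact time averages of x and x^2 along the piecewise-linear path."""
    total = skeleton[-1][0] - skeleton[0][0]
    m1 = m2 = 0.0
    for (t0, xa, _), (t1, xb, _) in zip(skeleton, skeleton[1:]):
        dt = t1 - t0
        m1 += dt * (xa + xb) / 2.0                      # integral of x
        m2 += dt * (xa * xa + xa * xb + xb * xb) / 3.0  # integral of x^2
    return m1 / total, m2 / total


if __name__ == "__main__":
    path = zigzag_gaussian_1d(20000.0, seed=1)
    mean, second_moment = time_average_moments(path)
    print(mean, second_moment)  # should be close to 0 and 1 respectively
```

Because the trajectory is piecewise linear, expectations are estimated by integrating along the segments between switching events rather than by averaging the skeleton points, which are not themselves samples from the target.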



Joris Bierkens. Paul Fearnhead. Gareth Roberts. "The Zig-Zag process and super-efficient sampling for Bayesian analysis of big data." Ann. Statist. 47 (3) 1288 - 1320, June 2019.


Received: 1 July 2016; Revised: 1 March 2018; Published: June 2019
First available in Project Euclid: 13 February 2019

zbMATH: 07053509
MathSciNet: MR3911113
Digital Object Identifier: 10.1214/18-AOS1715

Primary: 65C60
Secondary: 60J25, 62F15, 65C05

Keywords: exact sampling, MCMC, nonreversible Markov process, piecewise deterministic Markov process, stochastic gradient Langevin dynamics, sub-sampling

Rights: Copyright © 2019 Institute of Mathematical Statistics

Vol.47 • No. 3 • June 2019