Statistical Science

Piecewise Deterministic Markov Processes for Continuous-Time Monte Carlo

Paul Fearnhead, Joris Bierkens, Murray Pollock, and Gareth O. Roberts

Abstract

Recently, there have been conceptually new developments in Monte Carlo methods through the introduction of new MCMC and sequential Monte Carlo (SMC) algorithms which are based on continuous-time, rather than discrete-time, Markov processes. This has led to some fundamentally new Monte Carlo algorithms which can be used to sample from, say, a posterior distribution. Interestingly, continuous-time algorithms seem particularly well suited to Bayesian analysis in big-data settings, as they need only access a small subset of data points at each iteration and yet are still guaranteed to target the true posterior distribution. Whilst continuous-time MCMC and SMC methods have been developed independently, we show here that they are related by the fact that both involve simulating a piecewise deterministic Markov process. Furthermore, we show that the methods developed to date are just specific cases of a potentially much wider class of continuous-time Monte Carlo algorithms. We give an informal introduction to piecewise deterministic Markov processes, covering the aspects relevant to these new Monte Carlo algorithms, with a view to making the development of new continuous-time Monte Carlo algorithms more accessible. We focus on how and why sub-sampling ideas can be used with these algorithms, and aim to give insight into how these new algorithms can be implemented and into some of the issues that affect their efficiency.

Article information

Source
Statist. Sci., Volume 33, Number 3 (2018), 386-412.

Dates
First available in Project Euclid: 13 August 2018

Permanent link to this document
https://projecteuclid.org/euclid.ss/1534147229

Digital Object Identifier
doi:10.1214/18-STS648

Mathematical Reviews number (MathSciNet)
MR3843382

Keywords
Bayesian statistics; big data; Bouncy Particle Sampler; continuous-time importance sampling; control variates; SCALE; Zig-Zag Sampler

Citation

Fearnhead, Paul; Bierkens, Joris; Pollock, Murray; Roberts, Gareth O. Piecewise Deterministic Markov Processes for Continuous-Time Monte Carlo. Statist. Sci. 33 (2018), no. 3, 386–412. doi:10.1214/18-STS648. https://projecteuclid.org/euclid.ss/1534147229

