Bayesian Analysis

Parallel Gaussian Process Surrogate Bayesian Inference with Noisy Likelihood Evaluations

Marko Järvenpää, Michael U. Gutmann, Aki Vehtari, and Pekka Marttinen

Advance publication

This article is in its final form and can be cited using the date of online publication and the DOI.

Full-text: Open access

Abstract

We consider Bayesian inference when only a limited number of noisy log-likelihood evaluations can be obtained. This occurs, for example, when complex simulator-based statistical models are fitted to data and the synthetic likelihood (SL) method is used to form the noisy log-likelihood estimates using computationally costly forward simulations. We frame the inference task as a sequential Bayesian experimental design problem, where the log-likelihood function is modelled with a hierarchical Gaussian process (GP) surrogate model, which is used to efficiently select additional log-likelihood evaluation locations. Motivated by recent progress in the related problem of batch Bayesian optimisation, we develop various batch-sequential design strategies that allow some of the potentially costly simulations to be run in parallel. We analyse the properties of the resulting method theoretically and empirically. Experiments with several toy problems and simulation models suggest that our method is robust, highly parallelisable, and sample-efficient.
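
As a concrete illustration of the setup described in the abstract, the following minimal Python sketch (our own illustration, not the authors' implementation) forms noisy synthetic log-likelihood estimates for a toy Gaussian model, fits a GP surrogate with scikit-learn, and selects evaluation locations batch-sequentially. The greedy uncertainty-sampling rule with "kriging believer" fantasies is a simple stand-in for the batch design strategies developed in the paper, and names such as sl_loglik are illustrative.

# Minimal sketch: noisy synthetic log-likelihood (SL) evaluations of a toy
# model are modelled with a GP surrogate, and new evaluation locations are
# chosen batch-sequentially. Stand-in acquisition, not the paper's designs.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def simulate_summaries(theta, n_data=50):
    # Toy simulator: data are N(theta, 1); summaries are (mean, variance).
    x = rng.normal(theta, 1.0, size=n_data)
    return np.array([x.mean(), x.var(ddof=1)])

obs = np.array([0.3, 1.1])  # "observed" summary statistics

def sl_loglik(theta, n_sim=30):
    # Noisy SL estimate: Gaussian log-density of the observed summaries
    # under the mean and covariance of simulated summaries (Wood, 2010).
    s = np.stack([simulate_summaries(theta) for _ in range(n_sim)])
    return multivariate_normal.logpdf(obs, mean=s.mean(axis=0), cov=np.cov(s.T))

# Initial space-filling design on the 1-D parameter space.
thetas = list(np.linspace(-3.0, 3.0, 5))
loglik = [sl_loglik(t) for t in thetas]

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)  # noisy evals
grid = np.linspace(-3.0, 3.0, 201)
batch_size, n_rounds = 4, 5

for _ in range(n_rounds):
    # Greedy batch: pick the candidate with the largest predictive std, then
    # "fantasize" its value with the GP mean (kriging believer) so the
    # remaining batch points spread out instead of piling up.
    X, y, batch = list(thetas), list(loglik), []
    for _ in range(batch_size):
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
        gp.fit(np.asarray(X)[:, None], np.asarray(y))
        mean, std = gp.predict(grid[:, None], return_std=True)
        i = int(np.argmax(std))
        batch.append(grid[i])
        X.append(grid[i])
        y.append(mean[i])
    # The batch of costly simulations could now be run in parallel.
    for t in batch:
        thetas.append(t)
        loglik.append(sl_loglik(t))

# Point estimate from the surrogate posterior (flat prior, up to a constant).
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(np.asarray(thetas)[:, None], np.asarray(loglik))
print("surrogate posterior mode near theta =",
      grid[int(np.argmax(gp.predict(grid[:, None])))])

Each batch of sl_loglik calls is independent, so in a real application the forward simulations within one batch would be dispatched to parallel workers; this is the source of the wall-clock savings that the batch-sequential designs target.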

Article information

Source
Bayesian Anal., Advance publication (2020), 32 pages.

Dates
First available in Project Euclid: 19 March 2020

Permanent link to this document
https://projecteuclid.org/euclid.ba/1584583229

Digital Object Identifier
doi:10.1214/20-BA1200

Keywords
expensive likelihoods, likelihood-free inference, surrogate modelling, Gaussian processes, sequential experiment design, parallel computing

Rights
Creative Commons Attribution 4.0 International License.

Citation

Järvenpää, Marko; Gutmann, Michael U.; Vehtari, Aki; Marttinen, Pekka. Parallel Gaussian Process Surrogate Bayesian Inference with Noisy Likelihood Evaluations. Bayesian Anal., advance publication, 19 March 2020. doi:10.1214/20-BA1200. https://projecteuclid.org/euclid.ba/1584583229


References

  • Acerbi, L. (2018). “Variational Bayesian Monte Carlo.” In Advances in Neural Information Processing Systems 31, 8223–8233.
  • An, Z., Nott, D. J., and Drovandi, C. (2019a). “Robust Bayesian synthetic likelihood via a semi-parametric approach.” Statistics and Computing.
  • An, Z., South, L. F., Nott, D. J., and Drovandi, C. C. (2019b). “Accelerating Bayesian Synthetic Likelihood with the Graphical Lasso.” Journal of Computational and Graphical Statistics, 28(2): 471–475.
  • Ankenman, B., Nelson, B. L., and Staum, J. (2010). “Stochastic Kriging for Simulation Metamodeling.” Operations Research, 58(2): 371–382.
  • Azimi, J., Fern, A., and Fern, X. Z. (2010). “Batch Bayesian Optimization via Simulation Matching.” In Advances in Neural Information Processing Systems 23, 109–117.
  • Bach, F. (2013). Learning with Submodular Functions: A Convex Optimization Perspective. Hanover, MA, USA: Now Publishers Inc.
  • Beaumont, M. A., Zhang, W., and Balding, D. J. (2002). “Approximate Bayesian computation in population genetics.” Genetics, 162(4): 2025–2035.
  • Bect, J., Bachoc, F., and Ginsbourger, D. (2019). “A supermartingale approach to Gaussian process based sequential design of experiments.” Bernoulli, 25(4A): 2883–2919.
  • Bect, J., Ginsbourger, D., Li, L., Picheny, V., and Vazquez, E. (2012). “Sequential design of computer experiments for the estimation of a probability of failure.” Statistics and Computing, 22(3): 773–793.
  • Briol, F.-X., Oates, C. J., Girolami, M., Osborne, M. A., and Sejdinovic, D. (2019). “Probabilistic Integration: A Role in Statistical Computation?” Statistical Science, 34(1): 1–22.
  • Brochu, E., Cora, V. M., and de Freitas, N. (2010). “A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning.” Available at: https://arxiv.org/abs/1012.2599.
  • Chai, H. R. and Garnett, R. (2019). “Improving Quadrature for Constrained Integrands.” In The 22nd International Conference on Artificial Intelligence and Statistics, 2751–2759.
  • Chevalier, C., Bect, J., Ginsbourger, D., Vazquez, E., Picheny, V., and Richet, Y. (2014). “Fast Parallel Kriging-Based Stepwise Uncertainty Reduction With Application to the Identification of an Excursion Set.” Technometrics, 56(4): 455–465.
  • Cockayne, J., Oates, C., Sullivan, T., and Girolami, M. (2019). “Bayesian Probabilistic Numerical Methods.” SIAM Review, 61(4): 756–789.
  • Contal, E., Buffoni, D., Robicquet, A., and Vayatis, N. (2013). “Parallel Gaussian process optimization with upper confidence bound and pure exploration.” In Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2013), Lecture Notes in Computer Science.
  • Desautels, T., Krause, A., and Burdick, J. W. (2014). “Parallelizing Exploration-Exploitation Tradeoffs in Gaussian Process Bandit Optimization.” Journal of Machine Learning Research, 15: 4053–4103.
  • Drovandi, C. C., Moores, M. T., and Boys, R. J. (2018). “Accelerating pseudo-marginal MCMC using Gaussian processes.” Computational Statistics & Data Analysis, 118: 1–17.
  • Frazier, D. T., Nott, D. J., Drovandi, C., and Kohn, R. (2019). “Bayesian inference using synthetic likelihood: asymptotics and adjustments.” Available at: https://arxiv.org/abs/1902.04827.
  • Gardner, J., Kusner, M., Xu, Z., Weinberger, K., and Cunningham, J. (2014). “Bayesian Optimization with Inequality Constraints.” In Proceedings of the 31st International Conference on Machine Learning, volume 32, 937–945.
  • Ginsbourger, D., Le Riche, R., and Carraro, L. (2010). Kriging Is Well-Suited to Parallelize Optimization, 131–162. Berlin, Heidelberg: Springer Berlin Heidelberg.
  • González, J., Dai, Z., Lawrence, N. D., and Hennig, P. (2016). “Batch Bayesian Optimization via Local Penalization.” In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 648–657.
  • González, J., Osborne, M., and Lawrence, N. D. (2016). “GLASSES: Relieving The Myopia of Bayesian Optimisation.” In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics.
  • Gunter, T., Osborne, M. A., Garnett, R., Hennig, P., and Roberts, S. J. (2014). “Sampling for Inference in Probabilistic Models with Fast Bayesian Quadrature.” In Advances in Neural Information Processing Systems 27, 2789–2797.
  • Gutmann, M. U. and Corander, J. (2016). “Bayesian optimization for likelihood-free inference of simulator-based statistical models.” Journal of Machine Learning Research, 17(125): 1–47.
  • Hennig, P., Osborne, M. A., and Girolami, M. (2015). “Probabilistic numerics and uncertainty in computations.” Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 471(2179): 20150142.
  • Hennig, P. and Schuler, C. J. (2012). “Entropy Search for Information-Efficient Global Optimization.” Journal of Machine Learning Research, 13: 1809–1837.
  • Hernández-Lobato, J. M., Hoffman, M. W., and Ghahramani, Z. (2014). “Predictive Entropy Search for Efficient Global Optimization of Black-box Functions.” In Advances in Neural Information Processing Systems 27, 918–926.
  • Jabot, F., Lagarrigues, G., Courbaud, B., and Dumoulin, N. (2014). “A comparison of emulation methods for Approximate Bayesian Computation.” Available at: http://arxiv.org/abs/1412.7560.
  • Järvenpää, M., Gutmann, M. U., Pleska, A., Vehtari, A., and Marttinen, P. (2019). “Efficient Acquisition Rules for Model-Based Approximate Bayesian Computation.” Bayesian Analysis, 14(2): 595–622.
  • Järvenpää, M., Gutmann, M. U., Vehtari, A., and Marttinen, P. (2018). “Gaussian process modelling in approximate Bayesian computation to estimate horizontal gene transfer in bacteria.” The Annals of Applied Statistics, 12(4): 2228–2251.
  • Järvenpää, M., Gutmann, M. U., Vehtari, A., and Marttinen, P. (2020). “Parallel Gaussian Process Surrogate Bayesian Inference with Noisy Likelihood Evaluations – Supplementary Material.” Bayesian Analysis.
  • Kandasamy, K., Schneider, J., and Póczos, B. (2015). “Bayesian active learning for posterior estimation.” In International Joint Conference on Artificial Intelligence, 3605–3611.
  • Karvonen, T., Oates, C. J., and Särkkä, S. (2018). “A Bayes-Sard Cubature Method.” In Advances in Neural Information Processing Systems 31, 5886–5897.
  • Kennedy, M. C. and O’Hagan, A. (2001). “Bayesian calibration of computer models.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(3): 425–464.
  • Krause, A. and Cevher, V. (2010). “Submodular Dictionary Selection for Sparse Representation.” In Proceedings of the 27th International Conference on International Conference on Machine Learning, 567–574.
  • Krause, A., Singh, A., and Guestrin, C. (2008). “Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies.” Journal of Machine Learning Research, 9: 235–284.
  • Lintusaari, J., Gutmann, M. U., Dutta, R., Kaski, S., and Corander, J. (2017). “Fundamentals and Recent Developments in Approximate Bayesian Computation.” Systematic Biology, 66(1): e66–e82.
  • Lyu, X., Binois, M., and Ludkovski, M. (2018). “Evaluating Gaussian Process Metamodels and Sequential Designs for Noisy Level Set Estimation.” Available at: http://arxiv.org/abs/1807.06712.
  • Marin, J. M., Pudlo, P., Robert, C. P., and Ryder, R. J. (2012). “Approximate Bayesian computational methods.” Statistics and Computing, 22(6): 1167–1180.
  • Marttinen, P., Gutmann, M. U., Croucher, N. J., Hanage, W. P., and Corander, J. (2015). “Recombination produces coherent bacterial species clusters in both core and accessory genomes.” Microbial Genomics, 1(5).
  • Meeds, E. and Welling, M. (2014). “GPS-ABC: Gaussian Process Surrogate Approximate Bayesian Computation.” In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence.
  • Nemhauser, G. L., Wolsey, L. A., and Fisher, M. L. (1978). “An analysis of approximations for maximizing submodular set functions—I.” Mathematical Programming, 14(1): 265–294.
  • O’Hagan, A. (1991). “Bayes-Hermite quadrature.” Journal of Statistical Planning and Inference, 29: 245–260.
  • O’Hagan, A. and Kingman, J. F. C. (1978). “Curve Fitting and Optimal Design for Prediction.” Journal of the Royal Statistical Society. Series B (Methodological), 40(1): 1–42.
  • Osborne, M. A., Duvenaud, D., Garnett, R., Rasmussen, C. E., Roberts, S. J., and Ghahramani, Z. (2012). “Active Learning of Model Evidence Using Bayesian Quadrature.” In Advances in Neural Information Processing Systems 25, 46–54.
  • Picheny, V., Ginsbourger, D., Richet, Y., and Caplin, G. (2013). “Quantile-Based Optimization of Noisy Computer Experiments With Tunable Precision.” Technometrics, 55(1): 2–13.
  • Price, L. F., Drovandi, C. C., Lee, A., and Nott, D. J. (2018). “Bayesian Synthetic Likelihood.” Journal of Computational and Graphical Statistics, 27(1): 1–11.
  • Rasmussen, C. E. (2003). “Gaussian Processes to Speed up Hybrid Monte Carlo for Expensive Bayesian Integrals.” Bayesian Statistics 7, 651–659.
  • Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. The MIT Press.
  • Riihimäki, J. and Vehtari, A. (2014). “Laplace Approximation for Logistic Gaussian Process Density Estimation and Regression.” Bayesian Analysis, 9(2): 425–448.
  • Robert, C. P. (2007). The Bayesian Choice. New York: Springer, second edition.
  • Robert, C. P. and Casella, G. (2004). Monte Carlo Statistical Methods. New York: Springer, second edition.
  • Shah, A. and Ghahramani, Z. (2015). “Parallel Predictive Entropy Search for Batch Global Optimization of Expensive Objective Functions.” In Advances in Neural Information Processing Systems 28, 3330–3338.
  • Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., and de Freitas, N. (2016). “Taking the human out of the loop: A review of Bayesian optimization.” Proceedings of the IEEE, 104(1): 148–175.
  • Sinsbeck, M. and Nowak, W. (2017). “Sequential Design of Computer Experiments for the Solution of Bayesian Inverse Problems.” SIAM/ASA Journal on Uncertainty Quantification, 5(1): 640–664.
  • Snoek, J., Larochelle, H., and Adams, R. P. (2012). “Practical Bayesian optimization of machine learning algorithms.” In Advances in Neural Information Processing Systems 25, 2951–2959.
  • Srinivas, N., Krause, A., Kakade, S., and Seeger, M. (2010). “Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design.” In Proceedings of the 27th International Conference on Machine Learning, 1015–1022.
  • Stuart, A. M. and Teckentrup, A. L. (2018). “Posterior consistency for Gaussian process approximations of Bayesian posterior distributions.” Mathematics of Computation, 87: 721–753.
  • Sui, Y., Gotovos, A., Burdick, J., and Krause, A. (2015). “Safe Exploration for Optimization with Gaussian Processes.” In Proceedings of the 32nd International Conference on Machine Learning, volume 37, 997–1005.
  • Thomas, O., Dutta, R., Corander, J., Kaski, S., and Gutmann, M. U. (2018). “Likelihood-free inference by ratio estimation.” Available at: https://arxiv.org/abs/1611.10242.
  • Turner, B. M. and Van Zandt, T. (2012). “A tutorial on approximate Bayesian computation.” Journal of Mathematical Psychology, 56(2): 69–85.
  • Wang, H. and Li, J. (2018). “Adaptive Gaussian Process Approximation for Bayesian Inference with Expensive Likelihood Functions.” Neural Computation, 30(11): 3072–3094.
  • Wilkinson, R. D. (2014). “Accelerating ABC methods using Gaussian processes.” In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics.
  • Wilson, J., Hutter, F., and Deisenroth, M. (2018). “Maximizing acquisition functions for Bayesian optimization.” In Advances in Neural Information Processing Systems 31, 9906–9917.
  • Wood, S. N. (2010). “Statistical inference for noisy nonlinear ecological dynamic systems.” Nature, 466: 1102–1104.
  • Wu, J. and Frazier, P. (2016). “The Parallel Knowledge Gradient Method for Batch Bayesian Optimization.” In Advances in Neural Information Processing Systems 29, 3126–3134.
  • Yu, C. W. and Clarke, B. (2011). “Median loss decision theory.” Journal of Statistical Planning and Inference, 141(2): 611–623.

Supplemental materials

  • Supplementary material of “Parallel Gaussian process surrogate Bayesian inference with noisy likelihood evaluations”. The supplementary material contains proofs, implementation details and additional experimental results.