Bayesian Analysis

Approximate Bayesian Computation by Modelling Summary Statistics in a Quasi-likelihood Framework

Stefano Cabras, Maria Eugenia Castellanos Nueda, and Erlis Ruli

Abstract

Approximate Bayesian Computation (ABC) is a useful class of methods for Bayesian inference when the likelihood function is computationally intractable. In practice, the basic ABC algorithm may be inefficient when the posterior differs markedly from the prior. In that case, more elaborate methods, such as ABC with a Markov chain Monte Carlo algorithm (ABC-MCMC), should be used. However, constructing a proposal density for MCMC is a delicate task, and it is especially difficult in the ABC setting, where the likelihood is intractable. We discuss an automatic proposal distribution for ABC-MCMC algorithms. This proposal is inspired by the theory of quasi-likelihood (QL) functions and is obtained by modelling the distribution of the summary statistics as a function of the parameters. Essentially, given a real-valued vector of summary statistics, we reparametrize the model by means of a regression function of the statistics on the parameters, estimated from samples drawn from the original model in a pilot-run simulation study. The QL theory is well established for a scalar parameter, and it is shown that when the conditional variance of the summary statistic is assumed constant, the QL has a closed-form normal density. This approach to constructing proposal distributions is extended to non-constant variance and to real-valued parameter vectors. The method is illustrated by several examples and by an application to a real problem in population genetics.
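
To make the idea in the abstract concrete, the following Python sketch runs ABC-MCMC on a toy problem with a scalar parameter and a QL-style normal proposal. It is an illustration under stated assumptions, not the authors' algorithm: the toy simulator (sample mean of normal draws), the uniform prior, the linear regression s = a + b*theta, the tolerance eps, and all function names are hypothetical choices made here. The normal proposal is used as a fixed independence proposal, which is a simplification and may differ from the proposal constructed in the paper.

```python
import numpy as np


def simulate_summary(theta, n, rng):
    """Toy simulator (assumption): sample mean of n draws from N(theta, 1)."""
    return rng.normal(theta, 1.0, size=n).mean()


def fit_pilot_regression(thetas, summaries):
    """Least-squares fit s = a + b*theta with a constant residual variance."""
    b, a = np.polyfit(thetas, summaries, deg=1)  # highest degree first
    resid = summaries - (a + b * thetas)
    sigma2 = resid.var(ddof=2)  # two regression coefficients estimated
    return a, b, sigma2


def normal_logpdf(x, mean, sd):
    return -0.5 * np.log(2.0 * np.pi * sd**2) - 0.5 * ((x - mean) / sd) ** 2


def abc_mcmc_ql(s_obs, n, prior_lo, prior_hi, eps, n_iter, n_pilot, seed=0):
    """ABC-MCMC with a normal independence proposal built from a pilot run."""
    rng = np.random.default_rng(seed)

    # Pilot run: draw parameters from the prior and simulate the summary.
    pilot_thetas = rng.uniform(prior_lo, prior_hi, size=n_pilot)
    pilot_s = np.array([simulate_summary(t, n, rng) for t in pilot_thetas])
    a, b, sigma2 = fit_pilot_regression(pilot_thetas, pilot_s)

    # With constant variance and linear f(theta) = a + b*theta, the function
    # exp(-(s_obs - f(theta))^2 / (2*sigma2)) is a normal density in theta.
    prop_mean = (s_obs - a) / b
    prop_sd = np.sqrt(sigma2) / abs(b)

    # Start the chain at the proposal mean, clipped to the prior support.
    theta = float(np.clip(prop_mean, prior_lo, prior_hi))
    chain = np.empty(n_iter)
    accepted = 0
    for t in range(n_iter):
        cand = rng.normal(prop_mean, prop_sd)
        if prior_lo <= cand <= prior_hi:  # uniform prior: zero outside support
            s_sim = simulate_summary(cand, n, rng)
            if abs(s_sim - s_obs) < eps:  # ABC distance check
                # Independence-sampler ratio q(theta) / q(cand); the uniform
                # prior cancels inside its support.
                log_ratio = (normal_logpdf(theta, prop_mean, prop_sd)
                             - normal_logpdf(cand, prop_mean, prop_sd))
                if np.log(rng.uniform()) < log_ratio:
                    theta = cand
                    accepted += 1
        chain[t] = theta
    return chain, accepted / n_iter, (prop_mean, prop_sd)
```

A call such as abc_mcmc_ql(s_obs=1.0, n=20, prior_lo=-10.0, prior_hi=10.0, eps=0.1, n_iter=2000, n_pilot=500) returns the chain, the acceptance rate, and the proposal mean and standard deviation. Because the proposal is centred where the fitted regression matches the observed summary, candidates tend to fall in regions compatible with the data even when the prior is diffuse.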

Article information

Source
Bayesian Anal., Volume 10, Number 2 (2015), 411–439.

Dates
First available in Project Euclid: 2 February 2015

Permanent link to this document
https://projecteuclid.org/euclid.ba/1422884980

Digital Object Identifier
doi:10.1214/14-BA921

Mathematical Reviews number (MathSciNet)
MR3420888

Zentralblatt MATH identifier
1335.62041

Keywords
Estimating function; Likelihood-free methods; Markov chain Monte Carlo; Proposal distribution; Pseudo-likelihood

Citation

Cabras, Stefano; Castellanos Nueda, Maria Eugenia; Ruli, Erlis. Approximate Bayesian Computation by Modelling Summary Statistics in a Quasi-likelihood Framework. Bayesian Anal. 10 (2015), no. 2, 411–439. doi:10.1214/14-BA921. https://projecteuclid.org/euclid.ba/1422884980


