Electronic Journal of Statistics

Use in practice of importance sampling for repeated MCMC for Poisson models

Dorota Gajda, Chantal Guihenneuc-Jouyaux, Judith Rousseau, Kerry Mengersen, and Darfiana Nur

Full-text: Open access

Abstract

Importance Sampling is used as an alternative to MCMC in repeated Bayesian estimations. In the particular context of numerous data sets, MCMC algorithms must be run many times, which can become computationally expensive. Since Importance Sampling requires a sample from a posterior distribution, our idea is to use MCMC to generate Markov chains for only a limited number of posteriors and to reuse them in the subsequent IS estimations. For each Importance Sampling procedure, a suitable chain is selected by one of the three criteria we present here. The first and second criteria are based, respectively, on the L1 norm of the difference between two posterior distributions and on their Kullback-Leibler divergence. The third criterion results from minimizing the variance of the IS estimate. A supplementary automatic selection procedure is also proposed to choose the posterior distributions for which Markov chains will be generated and to avoid an arbitrary choice of importance functions. The methods are illustrated in simulation studies on three types of Poisson model: a simple Poisson model, a Poisson regression model, and a Poisson regression model with extra-Poisson variability. Different parameter settings are considered.
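The chain-reuse idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the simple Poisson model with a conjugate Gamma(a, b) prior, draws the "stored chains" directly from the conjugate posteriors as a stand-in for MCMC output, and selects a chain by maximizing the effective sample size of the importance weights, which is only a common proxy for the minimum-variance criterion described above. All data sets, parameter values, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simple Poisson model y_i ~ Poisson(lam) with a conjugate Gamma(a, b) prior,
# so the posterior is Gamma(a + sum(y), rate b + n).  Sampling it directly
# stands in for the MCMC output that the paper would store and reuse.
a, b = 1.0, 1.0

def posterior_draws(y, size=5000):
    return rng.gamma(a + y.sum(), 1.0 / (b + len(y)), size=size)

def log_post(lam, y):
    # Unnormalized log posterior of the Poisson-Gamma model.
    return (a + y.sum() - 1.0) * np.log(lam) - (b + len(y)) * lam

# A few "pilot" data sets whose posterior chains are generated once and stored.
pilot_data = [rng.poisson(mu, size=30) for mu in (2.0, 5.0, 9.0)]
chains = [posterior_draws(y) for y in pilot_data]

# New data set: estimate its posterior mean of lam by self-normalized
# importance sampling, reusing each stored chain as the importance function.
y_new = rng.poisson(4.5, size=30)

def is_estimate(chain, y_prop):
    # Importance weights: target posterior (y_new) over proposal posterior.
    logw = log_post(chain, y_new) - log_post(chain, y_prop)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)          # effective sample size of the weights
    return np.sum(w * chain), ess

estimates = [is_estimate(c, y) for c, y in zip(chains, pilot_data)]
best = max(range(len(estimates)), key=lambda i: estimates[i][1])
print(f"selected chain {best}, IS posterior mean = {estimates[best][0]:.3f}")
```

In this toy setting the chain whose pilot posterior lies closest to the new posterior should receive the largest effective sample size and therefore be selected, mirroring the role of the selection criteria in the paper.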

Article information

Source
Electron. J. Statist. Volume 4 (2010), 361-383.

Dates
First available in Project Euclid: 17 March 2010

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1268831481

Digital Object Identifier
doi:10.1214/09-EJS527

Mathematical Reviews number (MathSciNet)
MR2645489

Zentralblatt MATH identifier
1329.62129

Subjects
Primary: 62F15: Bayesian inference; 65C05: Monte Carlo methods
Secondary: 65C60: Computational problems in statistics

Keywords
MCMC; Importance Sampling; Poisson model

Citation

Gajda, Dorota; Guihenneuc-Jouyaux, Chantal; Rousseau, Judith; Mengersen, Kerry; Nur, Darfiana. Use in practice of importance sampling for repeated MCMC for Poisson models. Electron. J. Statist. 4 (2010), 361--383. doi:10.1214/09-EJS527. https://projecteuclid.org/euclid.ejs/1268831481


