Bayesian Analysis

Perfect Simulation for Mixtures with Known and Unknown Number of Components

Sabyasachi Mukhopadhyay and Sourabh Bhattacharya


Abstract

We propose and develop a novel and effective perfect sampling methodology for simulating from posteriors corresponding to mixtures with either a known (fixed) or an unknown number of components. For the latter we consider the Dirichlet process-based mixture model developed by these authors, and show that our ideas are applicable to conjugate and, importantly, to non-conjugate cases. As is to be expected, and as we show, perfect sampling for mixtures with a known number of components can be achieved with much less effort using a simplified version of our general methodology, whether conjugate or non-conjugate priors are used. While no special assumption is necessary in the conjugate set-up for our theory to work, we require the assumption of a compact parameter space in the non-conjugate set-up. However, we argue, with appropriate analytical, simulation, and real data studies as support, that such a compactness assumption is not unrealistic and is not an impediment in practice. Not only do we validate our ideas theoretically and with simulation studies, but we also apply our proposal to three real data sets used by several authors in the past in connection with mixture models. The results we achieved in each of our experiments, whether simulation study or real data application, are quite encouraging. However, the computation can be extremely burdensome with a large number of mixture components and with massive data sets. We discuss the role of parallel processing in mitigating this computational burden.

Article information

Source
Bayesian Anal. Volume 7, Number 3 (2012), 675-714.

Dates
First available in Project Euclid: 28 August 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1346158780

Digital Object Identifier
doi:10.1214/12-BA723

Mathematical Reviews number (MathSciNet)
MR2981632

Zentralblatt MATH identifier
1330.60090

Keywords
Bounding chains; Dirichlet process; Gibbs sampling; Mixtures; Optimization; Perfect sampling

Citation

Mukhopadhyay, Sabyasachi; Bhattacharya, Sourabh. Perfect Simulation for Mixtures with Known and Unknown Number of Components. Bayesian Anal. 7 (2012), no. 3, 675-714. doi:10.1214/12-BA723. https://projecteuclid.org/euclid.ba/1346158780


