Bayesian Analysis

Particle learning for general mixtures

Carlos M. Carvalho, Hedibert F. Lopes, Nicholas G. Polson, and Matt A. Taddy

Full-text: Open access

Abstract

This paper develops particle learning (PL) methods for the estimation of general mixture models. The approach is distinguished from alternative particle filtering methods in two major ways. First, each iteration begins by resampling particles according to posterior predictive probability, leading to a more efficient set for propagation. Second, each particle tracks only the "essential state vector," thus leading to reduced-dimensional inference. In addition, we describe how the approach applies to more general mixture models of current interest in the literature; it is hoped that this will inspire a greater number of researchers to adopt sequential Monte Carlo methods for fitting their sophisticated mixture-based models. Finally, we show that PL leads to straightforward tools for marginal likelihood calculation and posterior cluster allocation.
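The two-step recipe in the abstract (resample particles by posterior predictive probability, then propagate while tracking only conjugate sufficient statistics) can be sketched for the canonical case of a Dirichlet process mixture of normals. The following is an illustrative reconstruction under assumed simplifications (known observation variance `sigma2`, conjugate normal base measure `N(m0, tau2)`), not the authors' code; all class and variable names are hypothetical.

```python
import numpy as np

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

class DPMixturePL:
    """Particle learning sketch for a DP mixture of normals with known
    observation variance sigma2 and conjugate N(m0, tau2) base measure.
    Each particle stores only the essential state vector: per-cluster
    counts and sums of observations (the sufficient statistics)."""

    def __init__(self, n_particles, alpha, sigma2, m0, tau2, seed=0):
        self.rng = np.random.default_rng(seed)
        self.alpha, self.sigma2, self.m0, self.tau2 = alpha, sigma2, m0, tau2
        # each particle: list of [count, sum_y] pairs, one per cluster
        self.particles = [[] for _ in range(n_particles)]

    def _cluster_weights(self, particle, y):
        """Unnormalized allocation weights: existing clusters, then a new one."""
        n = sum(c for c, _ in particle)
        w = []
        for c, s in particle:
            post_var = 1.0 / (1.0 / self.tau2 + c / self.sigma2)
            post_mean = post_var * (self.m0 / self.tau2 + s / self.sigma2)
            w.append(c / (self.alpha + n)
                     * normal_pdf(y, post_mean, post_var + self.sigma2))
        w.append(self.alpha / (self.alpha + n)
                 * normal_pdf(y, self.m0, self.tau2 + self.sigma2))
        return np.array(w)

    def update(self, y):
        # Step 1: resample particles with weights proportional to the
        # posterior predictive p(y | particle).
        pred = np.array([self._cluster_weights(p, y).sum()
                         for p in self.particles])
        probs = pred / pred.sum()
        idx = self.rng.choice(len(self.particles), len(self.particles), p=probs)
        self.particles = [[list(cs) for cs in self.particles[i]] for i in idx]
        # Step 2: propagate each particle by sampling an allocation for y
        # and updating the chosen cluster's sufficient statistics.
        for p in self.particles:
            w = self._cluster_weights(p, y)
            j = self.rng.choice(len(w), p=w / w.sum())
            if j == len(p):       # open a new cluster
                p.append([1, y])
            else:                 # fold y into an existing cluster
                p[j][0] += 1
                p[j][1] += y
        return pred.mean()        # Monte Carlo estimate of p(y | past data)
```

Averaging the returned predictive densities over observations gives the running marginal likelihood estimate mentioned in the abstract, and the particle allocations give posterior cluster membership.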

Article information

Source
Bayesian Anal. Volume 5, Number 4 (2010), 709-740.

Dates
First available in Project Euclid: 19 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1340110852

Digital Object Identifier
doi:10.1214/10-BA525

Mathematical Reviews number (MathSciNet)
MR2740154

Zentralblatt MATH identifier
1330.62348

Keywords
Nonparametric mixture models; particle filtering; Dirichlet process; Indian buffet process; probit stick-breaking

Citation

Carvalho, Carlos M.; Lopes, Hedibert F.; Polson, Nicholas G.; Taddy, Matt A. Particle learning for general mixtures. Bayesian Anal. 5 (2010), no. 4, 709--740. doi:10.1214/10-BA525. https://projecteuclid.org/euclid.ba/1340110852
