Bayesian Analysis

Variational inference for Dirichlet process mixtures

David M. Blei and Michael I. Jordan


Abstract

Dirichlet process (DP) mixture models are the cornerstone of nonparametric Bayesian statistics, and the development of Markov chain Monte Carlo (MCMC) sampling methods for DP mixtures has enabled the application of nonparametric Bayesian methods to a variety of practical data analysis problems. However, MCMC sampling can be prohibitively slow, and it is important to explore alternatives. One class of alternatives is provided by variational methods, a class of deterministic algorithms that convert inference problems into optimization problems (Opper and Saad 2001; Wainwright and Jordan 2003). Thus far, variational methods have mainly been explored in the parametric setting, in particular within the formalism of the exponential family (Attias 2000; Ghahramani and Beal 2001; Blei et al. 2003). In this paper, we present a variational inference algorithm for DP mixtures. We present experiments comparing the algorithm to Gibbs sampling for DP mixtures of Gaussians, and an application to a large-scale image analysis problem.
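To make the "inference as optimization" idea concrete, the following sketch implements truncated stick-breaking coordinate-ascent updates in the spirit of the paper's mean-field algorithm, specialized to a one-dimensional DP mixture of Gaussians with known variance sigma2 and a N(0, tau2) prior on the cluster means. The function name dp_mixture_cavi, the truncation level T, and this simplified setting are illustrative assumptions, not the authors' code.

    # Minimal sketch: coordinate-ascent variational inference (CAVI) for a
    # truncated stick-breaking DP mixture of 1-D Gaussians with known
    # variance. Names and the simplified likelihood are assumptions.
    import numpy as np
    from scipy.special import digamma

    def dp_mixture_cavi(x, T=20, alpha=1.0, sigma2=1.0, tau2=10.0,
                        n_iters=100, seed=0):
        """CAVI for a truncated DP mixture of 1-D Gaussians
        (known variance sigma2, N(0, tau2) prior on cluster means)."""
        rng = np.random.default_rng(seed)
        N = x.shape[0]
        # q(z_n) = Multinomial(phi_n); random initial responsibilities.
        phi = rng.dirichlet(np.ones(T), size=N)          # N x T
        # q(V_t) = Beta(gamma1[t], gamma2[t]); q(mu_t) = N(m[t], s2[t]).
        gamma1 = np.ones(T)
        gamma2 = np.full(T, alpha)
        m = rng.normal(0.0, 1.0, size=T)
        s2 = np.ones(T)
        for _ in range(n_iters):
            Nt = phi.sum(axis=0)                         # expected counts
            # Stick-breaking Beta updates: gamma1 = 1 + N_t,
            # gamma2 = alpha + sum_{j > t} N_j.
            gamma1 = 1.0 + Nt
            gamma2 = alpha + (Nt[::-1].cumsum()[::-1] - Nt)
            # Gaussian posteriors over cluster means.
            s2 = 1.0 / (1.0 / tau2 + Nt / sigma2)
            m = s2 * (phi.T @ x) / sigma2
            # Responsibilities: E[log V_t] + sum_{j < t} E[log(1 - V_j)]
            # plus the expected Gaussian log-likelihood (constants drop
            # out of the softmax below).
            dg_sum = digamma(gamma1 + gamma2)
            e_log_v = digamma(gamma1) - dg_sum
            e_log_1mv = digamma(gamma2) - dg_sum
            log_pi = e_log_v + np.concatenate(
                ([0.0], np.cumsum(e_log_1mv)[:-1]))
            log_lik = (-0.5 * (x[:, None] - m[None, :]) ** 2 / sigma2
                       - 0.5 * s2[None, :] / sigma2)
            log_phi = log_pi[None, :] + log_lik
            log_phi -= log_phi.max(axis=1, keepdims=True)  # stabilize
            phi = np.exp(log_phi)
            phi /= phi.sum(axis=1, keepdims=True)
        return phi, gamma1, gamma2, m, s2

Each sweep updates the Beta parameters of the stick proportions, the Gaussian posteriors over the cluster means, and the responsibilities in closed form, so every step is a deterministic optimization move rather than a Monte Carlo draw.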

Article information

Source
Bayesian Anal. Volume 1, Number 1 (2006), 121–143.

Dates
First available: 22 June 2012

Permanent link to this document
http://projecteuclid.org/euclid.ba/1340371077

Mathematical Reviews number (MathSciNet)
MR2227367

Digital Object Identifier
doi:10.1214/06-BA104

Citation

Blei, David M.; Jordan, Michael I. Variational inference for Dirichlet process mixtures. Bayesian Analysis 1 (2006), no. 1, 121–143. doi:10.1214/06-BA104. http://projecteuclid.org/euclid.ba/1340371077.



References

  • Antoniak, C. 1974. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. The Annals of Statistics 2(6): 1152–1174.
  • Attias, H. 2000. A variational Bayesian framework for graphical models. In Advances in Neural Information Processing Systems 12, eds. S. Solla, T. Leen, and K. Muller, 209–215. Cambridge, MA: MIT Press.
  • Barnard, K., P. Duygulu, N. de Freitas, D. Forsyth, D. Blei, and M. Jordan. 2003. Matching words and pictures. Journal of Machine Learning Research 3: 1107–1135.
  • Bertsekas, D. 1999. Nonlinear Programming. 2nd ed. Belmont, MA: Athena Scientific.
  • Blackwell, D. and J. MacQueen. 1973. Ferguson distributions via Pólya urn schemes. The Annals of Statistics 1(2): 353–355.
  • Blei, D., A. Ng, and M. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research 3: 993–1022.
  • Connor, R. and J. Mosimann. 1969. Concepts of independence for proportions with a generalization of the Dirichlet distribution. Journal of the American Statistical Association 64(325): 194–206.
  • Escobar, M. and M. West. 1995. Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association 90: 577–588.
  • Ferguson, T. 1973. A Bayesian analysis of some nonparametric problems. The Annals of Statistics 1: 209–230.
  • Ghahramani, Z. and M. Beal. 2001. Propagation algorithms for variational Bayesian learning. In Advances in Neural Information Processing Systems 13, eds. T. Leen, T. Dietterich, and V. Tresp, 507–513. Cambridge, MA: MIT Press.
  • Ishwaran, H. and L. James. 2001. Gibbs sampling methods for stick-breaking priors. Journal of the American Statistical Association 96: 161–174.
  • Jeon, J., V. Lavrenko, and R. Manmatha. 2003. Automatic image annotation and retrieval using cross-media relevance models. In Proceedings of the 26th Annual International ACM SIGIR conference on Research and Development in Information Retrieval, 119–126. ACM Press.
  • Jordan, M., Z. Ghahramani, T. Jaakkola, and L. Saul. 1999. An introduction to variational methods for graphical models. Machine Learning 37: 183–233.
  • MacEachern, S. 1994. Estimating normal means with a conjugate style Dirichlet process prior. Communications in Statistics B 23: 727–741.
  • MacEachern, S. 1998. Computational methods for mixture of Dirichlet process models. In Practical Nonparametric and Semiparametric Bayesian Statistics, eds. D. Dey, P. Müller, and D. Sinha, 23–44. New York: Springer.
  • Neal, R. 2000. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics 9(2): 249–265.
  • Opper, M. and D. Saad (eds.). 2001. Advanced Mean Field Methods: Theory and Practice. Cambridge, MA: MIT Press.
  • Raftery, A. and S. Lewis. 1992. One long run with diagnostics: Implementation strategies for Markov chain Monte Carlo. Statistical Science 7: 493–497.
  • Robert, C. and G. Casella. 2004. Monte Carlo Statistical Methods. New York, NY: Springer-Verlag.
  • Sethuraman, J. 1994. A constructive definition of Dirichlet priors. Statistica Sinica 4: 639–650.
  • Wainwright, M. and M. Jordan. 2003. Graphical models, exponential families, and variational inference. Tech. Rep. 649, U.C. Berkeley, Dept. of Statistics.
  • Wiegerinck, W. 2000. Variational approximations between mean field theory and the junction tree algorithm. In Proceedings of the 16th Annual Conference on Uncertainty in Artificial Intelligence (UAI-00), eds. C. Boutilier and M. Goldszmidt, 626–633. San Francisco, CA: Morgan Kaufmann Publishers.
  • Xing, E., M. Jordan, and S. Russell. 2003. A generalized mean field algorithm for variational inference in exponential families. In Proceedings of the 19th Annual Conference on Uncertainty in Artificial Intelligence (UAI-03), eds. C. Meek and U. Kjærulff, 583–591. San Francisco, CA: Morgan Kaufmann Publishers.