Bayesian Analysis

Feature Allocations, Probability Functions, and Paintboxes

Tamara Broderick, Jim Pitman, and Michael I. Jordan

Full-text: Open access

Abstract

The problem of inferring a clustering of a data set has been the subject of much research in Bayesian analysis, and there currently exists a solid mathematical foundation for Bayesian approaches to clustering. In particular, the class of probability distributions over partitions of a data set has been characterized in a number of ways, including via exchangeable partition probability functions (EPPFs) and the Kingman paintbox. Here, we develop a generalization of the clustering problem, called feature allocation, where we allow each data point to belong to an arbitrary, non-negative integer number of groups, now called features or topics. We define and study an “exchangeable feature probability function” (EFPF)—analogous to the EPPF in the clustering setting—for certain types of feature models. Moreover, we introduce a “feature paintbox” characterization—analogous to the Kingman paintbox for clustering—of the class of exchangeable feature models. We provide a further characterization of the subclass of feature allocations that have EFPF representations.
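To make the distinction between a clustering and a feature allocation concrete, here is a minimal sketch. It assumes a representation that is not taken from the article: data points are the indices 0..n-1 and an allocation is a Python list of index sets. The helper names `membership_counts` and `is_partition` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def membership_counts(blocks, n):
    """Return, for each index 0..n-1, the number of blocks containing it."""
    counts = Counter()
    for block in blocks:
        for i in block:
            counts[i] += 1
    return [counts[i] for i in range(n)]


def is_partition(blocks, n):
    """A clustering is the special case where every index lies in exactly one block."""
    return all(c == 1 for c in membership_counts(blocks, n))


# Six data points, indexed 0..5 (illustrative data, not from the article).
n = 6

# A clustering: each point belongs to exactly one group.
clustering = [{0, 1}, {2, 3, 4}, {5}]

# A feature allocation: points may belong to zero, one, or several features.
features = [{0, 1, 2}, {1, 2, 4}, {4}]

print(is_partition(clustering, n))     # every count equals 1
print(is_partition(features, n))       # counts vary, so not a partition
print(membership_counts(features, n))  # per-point number of features
```

In this sketch, points 3 and 5 carry no features while points 1, 2, and 4 carry two each, which is exactly the freedom that a partition rules out.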

Article information

Source
Bayesian Anal., Volume 8, Number 4 (2013), 801-836.

Dates
First available in Project Euclid: 4 December 2013

Permanent link to this document
https://projecteuclid.org/euclid.ba/1386166314

Digital Object Identifier
doi:10.1214/13-BA823

Mathematical Reviews number (MathSciNet)
MR3150470

Zentralblatt MATH identifier
1329.62278

Keywords
feature; feature allocation; paintbox; EFPF; feature frequency model; Indian buffet process; beta process

Citation

Broderick, Tamara; Pitman, Jim; Jordan, Michael I. Feature Allocations, Probability Functions, and Paintboxes. Bayesian Anal. 8 (2013), no. 4, 801–836. doi:10.1214/13-BA823. https://projecteuclid.org/euclid.ba/1386166314


