Bayesian Analysis

Beta Processes, Stick-Breaking and Power Laws

Tamara Broderick, Michael I. Jordan, and Jim Pitman

Abstract

The beta-Bernoulli process provides a Bayesian nonparametric prior for models involving collections of binary-valued features. A draw from the beta process yields an infinite collection of probabilities in the unit interval, and a draw from the Bernoulli process turns these into binary-valued features. Recent work has provided stick-breaking representations for the beta process analogous to the well-known stick-breaking representation for the Dirichlet process. We derive one such stick-breaking representation directly from the characterization of the beta process as a completely random measure. This approach motivates a three-parameter generalization of the beta process, and we study the power laws that can be obtained from this generalized beta process. We present a posterior inference algorithm for the beta-Bernoulli process that exploits the stick-breaking representation, and we present experimental results for a discrete factor-analysis model.
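The two-stage draw described in the abstract can be sketched in a few lines. What follows is a minimal, truncated illustration rather than the authors' algorithm. It assumes one stick-breaking scheme from the beta-process literature, in which each round contributes a Poisson(gamma) number of atoms and an atom born in round i has weight equal to a Beta(1, theta) break times i - 1 independent Beta(1, theta) remainders. The function names, the parameters `gamma`, `theta`, and `num_rounds`, and the truncation itself are assumptions made for illustration. Atom locations are omitted, and the three-parameter generalization is not covered.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) count with Knuth's multiplication method (small rates only)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_beta_process_weights(gamma, theta, num_rounds, rng):
    """Truncated stick-breaking sketch: returns atom weights in the unit interval.

    Assumed construction (hypothetical parameter names): round i adds
    Poisson(gamma) atoms; each atom's weight is one Beta(1, theta) break
    multiplied by (i - 1) independent remainders 1 - Beta(1, theta).
    """
    weights = []
    for round_index in range(1, num_rounds + 1):
        for _ in range(sample_poisson(gamma, rng)):
            weight = rng.betavariate(1.0, theta)
            for _ in range(round_index - 1):
                weight *= 1.0 - rng.betavariate(1.0, theta)
            weights.append(weight)
    return weights


def sample_bernoulli_process(weights, num_rows, rng):
    """Turn the weights into binary features: row n has feature k with probability weights[k]."""
    return [[1 if rng.random() < w else 0 for w in weights] for _ in range(num_rows)]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = sample_beta_process_weights(gamma=3.0, theta=2.0, num_rounds=5, rng=rng)
    features = sample_bernoulli_process(weights, num_rows=4, rng=rng)
    print(len(weights), "atoms kept after truncation")
    for row in features:
        print(row)
```

Atoms born in later rounds carry geometrically smaller weights on average, which is why truncating at a modest number of rounds is a common approximation. The truncation level here is an arbitrary choice, not a recommendation from the paper.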

Article information

Source
Bayesian Anal., Volume 7, Number 2 (2012), 439-476.

Dates
First available in Project Euclid: 16 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1339878895

Digital Object Identifier
doi:10.1214/12-BA715

Mathematical Reviews number (MathSciNet)
MR2934958

Zentralblatt MATH identifier
1330.62218

Keywords
beta process; stick-breaking; power law

Citation

Broderick, Tamara; Jordan, Michael I.; Pitman, Jim. Beta Processes, Stick-Breaking and Power Laws. Bayesian Anal. 7 (2012), no. 2, 439–476. doi:10.1214/12-BA715. https://projecteuclid.org/euclid.ba/1339878895

