Bayesian Analysis

Bayesian Analysis of Dynamic Linear Topic Models

Chris Glynn, Surya T. Tokdar, Brian Howard, and David L. Banks

Full-text: Open access

Abstract

Discovering temporal evolution of themes from a time-stamped collection of text poses a challenging statistical learning problem. Dynamic topic models offer a probabilistic modeling framework to decompose a corpus of text documents into “topics”, i.e., probability distributions over vocabulary terms, while simultaneously learning the temporal dynamics of the relative prevalence of these topics. We extend the dynamic topic model of Blei and Lafferty (2006) by fusing its multinomial factor model on topics with dynamic linear models that account for time trends and seasonality in topic prevalence. A Markov chain Monte Carlo (MCMC) algorithm that utilizes Pólya-Gamma data augmentation is developed for posterior sampling. Conditional independencies in the model and sampling are made explicit, and our MCMC algorithm is parallelized where possible to allow for inference in large corpora. Our model and inference algorithm are validated with multiple synthetic examples, and we consider the applied problem of modeling trends in real estate listings from the housing website Zillow. We demonstrate in synthetic examples that sharing information across documents is critical for accurately estimating document-specific topic proportions. Analysis of the Zillow corpus demonstrates that the method is able to learn seasonal patterns and locally linear trends in topic prevalence.
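The Pólya-Gamma data augmentation mentioned above requires drawing PG(b, c) random variates inside the Gibbs sampler. As a rough illustration only (not the authors' implementation, which the references suggest follows exact samplers such as Polson et al. 2013 or Windle et al. 2014), a PG(b, c) draw can be approximated by truncating its sum-of-gammas representation, PG(b, c) = (1/2π²) Σₖ gₖ / ((k − ½)² + c²/(4π²)) with gₖ ~ Gamma(b, 1). The function name `rpg_approx` and all parameter choices below are hypothetical:

```python
import numpy as np

def rpg_approx(b, c, size, n_terms=200, rng=None):
    """Approximate PG(b, c) draws by truncating the infinite
    sum-of-gammas representation at n_terms terms.

    Illustrative sketch only; exact samplers (e.g. Devroye-style
    alternating series) are preferred in practice.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = np.arange(1, n_terms + 1)
    # Denominator (k - 1/2)^2 + c^2 / (4 pi^2) from the series representation
    denom = (k - 0.5) ** 2 + (c ** 2) / (4.0 * np.pi ** 2)
    # g_k ~ Gamma(b, 1), independently for each draw and each series term
    g = rng.gamma(shape=b, scale=1.0, size=(size, n_terms))
    return (g / denom).sum(axis=1) / (2.0 * np.pi ** 2)

# Sanity check against the known mean E[PG(b, c)] = (b / 2c) * tanh(c / 2)
b, c = 1.0, 1.5
draws = rpg_approx(b, c, size=50_000, rng=np.random.default_rng(0))
print(draws.mean(), (b / (2 * c)) * np.tanh(c / 2))
```

Truncation biases the draws slightly downward; with 200 terms the neglected tail of the mean is on the order of 10⁻⁴, which is why exact samplers are used when the augmentation sits inside a long MCMC run.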

Article information

Source
Bayesian Anal., Volume 14, Number 1 (2019), 53–80.

Dates
First available in Project Euclid: 14 April 2018

Permanent link to this document
https://projecteuclid.org/euclid.ba/1523671249

Digital Object Identifier
doi:10.1214/18-BA1100

Keywords
topic model; dynamic linear model; Pólya-Gamma; MCMC

Rights
Creative Commons Attribution 4.0 International License.

Citation

Glynn, Chris; Tokdar, Surya T.; Howard, Brian; Banks, David L. Bayesian Analysis of Dynamic Linear Topic Models. Bayesian Anal. 14 (2019), no. 1, 53--80. doi:10.1214/18-BA1100. https://projecteuclid.org/euclid.ba/1523671249



References

  • Anandkumar, A., Foster, D. P., Hsu, D. J., Kakade, S. M., and Liu, Y.-K. (2012). “A spectral algorithm for latent Dirichlet allocation.” In Advances in Neural Information Processing Systems, 917–925.
  • Anandkumar, A., Hsu, D. J., Janzamin, M., and Kakade, S. M. (2013). “When are overcomplete topic models identifiable? Uniqueness of tensor Tucker decompositions with structured sparsity.” In Advances in Neural Information Processing Systems, 1986–1994.
  • Arora, S., Ge, R., Halpern, Y., Mimno, D., Moitra, A., Sontag, D., Wu, Y., and Zhu, M. (2013). “A practical algorithm for topic modeling with provable guarantees.” In International Conference on Machine Learning, 280–288.
  • Blei, D. M. and Lafferty, J. D. (2006). “Dynamic topic models.” In Proceedings of the 23rd International Conference on Machine Learning, ICML ’06, 113–120. New York, NY, USA: ACM. URL http://doi.acm.org/10.1145/1143844.1143859
  • Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). “Latent Dirichlet allocation.” Journal of Machine Learning Research, 3: 993–1022. URL http://dl.acm.org/citation.cfm?id=944919.944937
  • Carter, C. K. and Kohn, R. (1994). “On Gibbs sampling for state space models.” Biometrika, 81(3): 541–553.
  • Chen, J., Zhu, J., Wang, Z., Zheng, X., and Zhang, B. (2013). “Scalable inference for logistic-Normal topic models.” In Burges, C. J. C., Bottou, L., Welling, M., Ghahramani, Z., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems 26, 2445–2453. Curran Associates, Inc.
  • Chib, S. (1996). “Calculating posterior distributions and modal estimates in Markov mixture models.” Journal of Econometrics, 75: 79–97.
  • Devroye, L. (2009). “On exact simulation algorithms for some distributions related to Jacobi theta functions.” Statistics & Probability Letters, 79(21): 2251–2259. URL http://www.sciencedirect.com/science/article/pii/S0167715209002867
  • Donoho, D. and Stodden, V. (2004). “When Does Non-Negative Matrix Factorization Give a Correct Decomposition into Parts?” In Thrun, S., Saul, L. K., and Schölkopf, B. (eds.), Advances in Neural Information Processing Systems 16, 1141–1148. MIT Press.
  • Frühwirth-Schnatter, S. (1994). “Data augmentation and dynamic linear models.” Journal of Time Series Analysis, 15(2): 183–202.
  • Gelman, A. and Rubin, D. B. (1992). “Inference from iterative simulation using multiple sequences.” Statistical Science, 7(4): 457–472. URL http://dx.doi.org/10.1214/ss/1177011136
  • Gerrish, S. and Blei, D. (2011). “DTM.” https://github.com/blei-lab/dtm.
  • Gilks, W. R. and Wild, P. (1992). “Adaptive rejection sampling for Gibbs sampling.” Applied Statistics, 337–348.
  • Gillis, N. (2013). “Robustness Analysis of Hottopixx, a Linear Programming Model for Factoring Nonnegative Matrices.” SIAM Journal on Matrix Analysis and Applications, 34(3): 1189–1212.
  • Gillis, N. (2014). “Successive Nonnegative Projection Algorithm for Robust Nonnegative Blind Source Separation.” SIAM Journal on Imaging Sciences, 7(2): 1420–1450.
  • Gillis, N. and Vavasis, S. A. (2014). “Fast and robust recursive algorithms for separable nonnegative matrix factorization.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(4): 698–714.
  • Glynn, C., Tokdar, S. T., Howard, B., and Banks, D. L. (2019). “Supplement to “Bayesian Analysis of Dynamic Linear Topic Models”.” Bayesian Analysis.
  • Griffiths, T. L. and Steyvers, M. (2004). “Finding scientific topics.” Proceedings of the National Academy of Sciences, 101(Suppl. 1): 5228–5235.
  • Hoffman, M., Bach, F. R., and Blei, D. M. (2010). “Online learning for latent Dirichlet allocation.” In Advances in Neural Information Processing Systems, 856–864.
  • Holmes, C. C. and Held, L. (2006). “Bayesian auxiliary variable models for binary and multinomial regression.” Bayesian Analysis, 1: 145–168.
  • Huang, K., Fu, X., and Sidiropoulos, N. D. (2016). “Anchor-free correlated topic modeling: Identifiability and algorithm.” In Advances in Neural Information Processing Systems, 1786–1794.
  • Kumar, A., Sindhwani, V., and Kambadur, P. (2013). “Fast conical hull algorithms for near-separable non-negative matrix factorization.” In Proceedings of the 30th International Conference on Machine Learning (ICML-13), 231–239.
  • Linderman, S., Johnson, M., and Adams, R. P. (2015). “Dependent multinomial models made easy: Stick-breaking with the Pólya-Gamma augmentation.” In Advances in Neural Information Processing Systems, 3456–3464.
  • Liu, Y.-K., Anandkumar, A., Foster, D. P., Hsu, D., and Kakade, S. M. (2012). “Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation.” In Neural Information Processing Systems (NIPS).
  • Polson, N. G., Scott, J. G., and Windle, J. (2013). “Bayesian inference for logistic models using Pólya-Gamma latent variables.” Journal of the American Statistical Association, 108(504): 1339–1349.
  • Recht, B., Re, C., Tropp, J., and Bittorf, V. (2012). “Factoring nonnegative matrices with linear programs.” In Advances in Neural Information Processing Systems, 1214–1222.
  • Teh, Y. W., Jordan, M. I., Beal, M. J., and Blei, D. M. (2006). “Hierarchical Dirichlet processes.” Journal of the American Statistical Association, 101(476): 1566–1581.
  • Teh, Y. W., Newman, D., and Welling, M. (2007). “A collapsed variational Bayesian inference algorithm for latent Dirichlet allocation.” In Advances in Neural Information Processing Systems, 1353–1360.
  • Turner, R. E. and Sahani, M. (2011). “Two problems with variational expectation maximisation for time-series models.” In Barber, D., Cemgil, T., and Chiappa, S. (eds.), Bayesian Time Series Models, chapter 5, 109–130. Cambridge University Press.
  • Wallach, H. M., Mimno, D. M., and McCallum, A. (2009). “Rethinking LDA: Why priors matter.” In Advances in Neural Information Processing Systems 22, 1973–1981. Curran Associates, Inc. URL http://papers.nips.cc/paper/3854-rethinking-lda-why-priors-matter.pdf
  • Watanabe, S. (2013). “A widely applicable Bayesian information criterion.” Journal of Machine Learning Research, 14(Mar): 867–897.
  • West, M. and Harrison, J. (1997). Bayesian Forecasting and Dynamic Models. New York, NY: Springer-Verlag, second edition.
  • Windle, J., Carvalho, C. M., Scott, J. G., and Sun, L. (2013). “Efficient data augmentation in dynamic models for binary and count data.” arXiv e-prints.
  • Windle, J., Polson, N. G., and Scott, J. G. (2014). “Sampling Pólya-Gamma random variates: alternate and approximate techniques.” arXiv e-prints.
