The Annals of Applied Statistics

A state-space mixed membership blockmodel for dynamic network tomography

Eric P. Xing, Wenjie Fu, and Le Song


Abstract

In a dynamic social or biological environment, the interactions between the actors can undergo large and systematic changes. In this paper we propose a model-based approach to analyzing what we will refer to as the dynamic tomography of such time-evolving networks. Our approach offers an intuitive but powerful tool for inferring the semantic underpinnings of each actor, such as its social roles or biological functions, that underlie the observed network topologies. Our model builds on earlier work on the mixed membership stochastic blockmodel for static networks and on state-space models for tracking object trajectories. It overcomes a major limitation of many current network inference techniques, which assume that each actor plays a unique and invariant role that accounts for all its interactions with other actors. Instead, our method models the role of each actor as a time-evolving mixed membership vector, which allows actors to behave differently over time and to carry out different roles or functions when interacting with different peers; this is closer to reality. We present an efficient algorithm for approximate inference and learning under our model, and we apply the model to a social network among monks (Sampson’s network), a dynamic email communication network among Enron employees, and a rewiring gene interaction network of the fruit fly collected over its full life cycle. In all cases, our model reveals interesting patterns in the dynamic roles of the actors.
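
To make the idea of a time-evolving mixed membership vector concrete, the following is a minimal simulation sketch, not the authors' implementation. The abstract does not give the model's equations, so every specific here is an assumption chosen for illustration: the Gaussian random walk on a shared latent state, the softmax (logistic-normal style) map to membership vectors, the fixed role-compatibility matrix `B`, and all function names and parameter values.

```python
import math
import random


def softmax(v):
    """Map a real vector to a probability vector (a point on the simplex)."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def sample_role(pi, rng):
    """Draw one role index according to the membership vector pi."""
    return rng.choices(range(len(pi)), weights=pi, k=1)[0]


def simulate(n_actors=5, n_roles=3, n_steps=4, drift=0.3, noise=0.5, seed=0):
    """Simulate a sequence of directed networks from evolving memberships.

    All parameter names and default values are hypothetical.
    """
    rng = random.Random(seed)
    # Assumed role-compatibility matrix: links are likely between actors
    # acting in the same role and unlikely otherwise.
    B = [[0.9 if k == l else 0.05 for l in range(n_roles)]
         for k in range(n_roles)]
    mu = [0.0] * n_roles  # latent state shared by all actors
    memberships, networks = [], []
    for _ in range(n_steps):
        # State-space step (assumed): Gaussian random walk on the latent state.
        mu = [m + rng.gauss(0.0, drift) for m in mu]
        # Each actor's membership vector is the softmax of a noisy copy
        # of the current latent state.
        pis = [softmax([m + rng.gauss(0.0, noise) for m in mu])
               for _ in range(n_actors)]
        adj = [[0] * n_actors for _ in range(n_actors)]
        for i in range(n_actors):
            for j in range(n_actors):
                if i == j:
                    continue
                # Roles are redrawn for every pair, so an actor can play
                # different roles with different peers.
                zi = sample_role(pis[i], rng)
                zj = sample_role(pis[j], rng)
                adj[i][j] = 1 if rng.random() < B[zi][zj] else 0
        memberships.append(pis)
        networks.append(adj)
    return memberships, networks


memberships, networks = simulate()
```

Here `memberships[t][i]` is actor `i`'s role distribution at step `t` and `networks[t]` is the adjacency matrix sampled at that step. Inference runs in the opposite direction, recovering the membership trajectories from the observed networks, and this sketch does not attempt it.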

Article information

Source
Ann. Appl. Stat. Volume 4, Number 2 (2010), 535-566.

Dates
First available in Project Euclid: 3 August 2010

Permanent link to this document
http://projecteuclid.org/euclid.aoas/1280842130

Digital Object Identifier
doi:10.1214/09-AOAS311

Mathematical Reviews number (MathSciNet)
MR2758639

Zentralblatt MATH identifier
1194.62133

Citation

Xing, Eric P.; Fu, Wenjie; Song, Le. A state-space mixed membership blockmodel for dynamic network tomography. Ann. Appl. Stat. 4 (2010), no. 2, 535–566. doi:10.1214/09-AOAS311. http://projecteuclid.org/euclid.aoas/1280842130.


