Electronic Journal of Statistics

Exponential-family random graph models for valued networks

Pavel N. Krivitsky

Abstract

Exponential-family random graph models (ERGMs) provide a principled and flexible way to model and simulate features common in social networks, such as propensities for homophily, mutuality, and friend-of-a-friend triad closure, through choice of model terms (sufficient statistics). However, those ERGMs modeling the more complex features have, to date, been limited to binary data: presence or absence of ties. Thus, analysis of valued networks, such as those where counts, measurements, or ranks are observed, has necessitated dichotomizing them, losing information and introducing biases.

In this work, we generalize ERGMs to valued networks. Focusing on modeling counts, we formulate an ERGM for networks whose ties are counts and discuss issues that arise when moving beyond the binary case. We introduce model terms that generalize and model common social network features for such data and apply these methods to a network dataset whose values are counts of interactions.

Article information

Source
Electron. J. Statist. Volume 6 (2012), 1100-1128.

Dates
First available in Project Euclid: 22 June 2012

Permanent link to this document
http://projecteuclid.org/euclid.ejs/1340369356

Digital Object Identifier
doi:10.1214/12-EJS696

Mathematical Reviews number (MathSciNet)
MR2988440

Zentralblatt MATH identifier
06166989

Subjects
Primary: 91D30: Social networks
Secondary: 60B05: Probability measures on topological spaces

Keywords
p-star model; transitivity; weighted network; count data; maximum likelihood estimation; Conway–Maxwell–Poisson distribution

Citation

Krivitsky, Pavel N. Exponential-family random graph models for valued networks. Electron. J. Statist. 6 (2012), 1100-1128. doi:10.1214/12-EJS696. http://projecteuclid.org/euclid.ejs/1340369356.

