The Annals of Applied Statistics

Modeling social networks from sampled data

Mark S. Handcock and Krista J. Gile
Source: Ann. Appl. Stat. Volume 4, Number 1 (2010), 5-25.

Abstract

Network models are widely used to represent relational information among interacting units and the structural implications of these relations. Recently, social network studies have focused a great deal of attention on random graph models of networks whose nodes represent individual social actors and whose edges represent a specified relationship between the actors.

Most inference for social network models assumes that the presence or absence of all possible links is observed, that the information is completely reliable, and that there are no measurement (e.g., recording) errors. This is clearly not true in practice, as much network data is collected through sample surveys. In addition, even if a census of a population is attempted, individuals and links between individuals are missed (i.e., do not appear in the recorded data).

In this paper we develop the conceptual and computational theory for inference based on sampled network information. We first review forms of network sampling designs used in practice. We consider inference from the likelihood framework, and develop a typology of network data that reflects their treatment within this frame. We then develop inference for social network models based on information from adaptive network designs.

We motivate and illustrate these ideas by analyzing the effect of link-tracing sampling designs on a collaboration network.
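
To make the idea of a link-tracing design concrete, the following is a minimal Python sketch and not the authors' implementation. It starts from a set of seed nodes, follows reported ties for a fixed number of waves, and then lists which dyads would be observed. The toy network and the function names `link_trace_sample` and `observed_dyads` are hypothetical. The sketch also assumes that every sampled node reports all of its ties, so a dyad is observed whenever at least one endpoint is sampled.

```python
from itertools import combinations


def link_trace_sample(adj, seeds, waves):
    """Return the nodes sampled after tracing links from `seeds` for `waves` waves."""
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(waves):
        # Nodes nominated by the current wave that have not been sampled yet.
        frontier = {j for i in frontier for j in adj[i]} - sampled
        sampled |= frontier
    return sampled


def observed_dyads(nodes, sampled):
    """Dyads with at least one sampled endpoint.

    Assumption: a sampled node reports the presence or absence of
    every tie it is involved in.
    """
    return {
        frozenset((i, j))
        for i, j in combinations(sorted(nodes), 2)
        if i in sampled or j in sampled
    }


# Hypothetical undirected toy network stored as adjacency sets.
adj = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1, 4}, 4: {3}, 5: set()}

sampled = link_trace_sample(adj, seeds={0}, waves=1)
dyads = observed_dyads(adj.keys(), sampled)
total_dyads = len(adj) * (len(adj) - 1) // 2

print(sorted(sampled))          # nodes reached from seed 0 in one wave
print(len(dyads), total_dyads)  # observed dyads versus all possible dyads
```

In this sketch the dyads among unsampled nodes are never observed. This is the kind of partially observed network data that the abstract's sampled-data inference is concerned with.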

Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1273584445
Digital Object Identifier: doi:10.1214/08-AOAS221
Zentralblatt MATH identifier: 1189.62187
Mathematical Reviews number (MathSciNet): MR2758082


2013 © Institute of Mathematical Statistics
