Electronic Journal of Statistics

Statistical models for cores decomposition of an undirected random graph

Vishesh Karwa, Michael J. Pelsmajer, Sonja Petrović, Despina Stasi, and Dane Wilburne

Full-text: Open access

Abstract

The $k$-core decomposition is a widely studied summary statistic that describes a graph’s global connectivity structure. In this paper, we move beyond using the $k$-core decomposition as a tool to summarize a graph and propose using it as a tool to model random graphs. We propose using the shell distribution vector, a way of summarizing the decomposition, as a sufficient statistic for a family of exponential random graph models (ERGMs). We study the properties and behavior of the model family, implement a Markov chain Monte Carlo algorithm for simulating graphs from the model, implement a direct sampler from the set of graphs with a given shell distribution, and explore the sampling distributions of some commonly used complementary statistics as candidates for heuristic model fitting. These algorithms provide the first fundamental steps toward solving the following problems: parameter estimation in this ERGM, extending the model to its Bayesian relative, and developing a rigorous methodology for testing goodness of fit of the model and for model selection. The methods are applied to a synthetic network as well as the well-known Sampson monks dataset.
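To make the central statistic concrete, the following is a minimal sketch, not the authors' implementation, of how a shell distribution vector could be computed for a small simple undirected graph. It assumes the usual definitions: the $k$-core is the maximal subgraph in which every vertex has degree at least $k$, a vertex's shell index is the largest $k$ for which it belongs to the $k$-core, and entry $i$ of the shell distribution counts the vertices with shell index $i$. The function names `shell_indices`, `shell_distribution`, and `log_weight` are hypothetical, and the last function only illustrates the generic exponential-family form with the shell distribution as sufficient statistic (the normalizing constant is omitted).

```python
def shell_indices(n, edges):
    """Return {vertex: shell index} for a simple undirected graph on vertices 0..n-1.

    Uses straightforward minimum-degree peeling (quadratic time), which is
    adequate for small illustrative graphs.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        if u != v:  # ignore self-loops; the graph is assumed simple
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Peel off a vertex of minimum current degree.
        v = min(remaining, key=lambda x: degree[x])
        # The shell index never decreases along the peeling order.
        k = max(k, degree[v])
        shell[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return shell


def shell_distribution(n, edges):
    """Return a list whose entry i counts the vertices with shell index i (i = 0..n-1)."""
    counts = [0] * n
    for s in shell_indices(n, edges).values():
        counts[s] += 1
    return counts


def log_weight(theta, counts):
    """Unnormalized log-probability <theta, shell distribution> (illustrative only)."""
    return sum(t * c for t, c in zip(theta, counts))


# Example graph (an assumption for illustration): a triangle on {0, 1, 2},
# a pendant vertex 3 attached to vertex 2, and an isolated vertex 4.
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(shell_distribution(5, example_edges))  # [1, 1, 3, 0, 0]
```

In this example the isolated vertex has shell index 0, the pendant vertex has shell index 1, and the three triangle vertices have shell index 2. For larger graphs, a linear-time algorithm such as the one by Batagelj and Zaveršnik would be the natural replacement for the quadratic peeling loop.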

Article information

Source
Electron. J. Statist., Volume 11, Number 1 (2017), 1949-1982.

Dates
Received: October 2015
First available in Project Euclid: 16 May 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1494900119

Digital Object Identifier
doi:10.1214/17-EJS1235

Mathematical Reviews number (MathSciNet)
MR3651020

Zentralblatt MATH identifier
1386.05178

Rights
Creative Commons Attribution 4.0 International License.

Citation

Karwa, Vishesh; Pelsmajer, Michael J.; Petrović, Sonja; Stasi, Despina; Wilburne, Dane. Statistical models for cores decomposition of an undirected random graph. Electron. J. Statist. 11 (2017), no. 1, 1949–1982. doi:10.1214/17-EJS1235. https://projecteuclid.org/euclid.ejs/1494900119


