The Annals of Statistics

Estimating and understanding exponential random graph models

Sourav Chatterjee and Persi Diaconis

Abstract

We introduce a method for the theoretical analysis of exponential random graph models. The method is based on a large-deviations approximation to the normalizing constant shown to be consistent using theory developed by Chatterjee and Varadhan [European J. Combin. 32 (2011) 1000–1017]. The theory explains a host of difficulties encountered by applied workers: many distinct models have essentially the same MLE, rendering the problems “practically” ill-posed. We give the first rigorous proofs of “degeneracy” observed in these models. Here, almost all graphs have essentially no edges or are essentially complete. We supplement recent work of Bhamidi, Bresler and Sly [2008 IEEE 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2008) 803–812 IEEE] showing that for many models, the extra sufficient statistics are useless: most realizations look like the results of a simple Erdős–Rényi model. We also find classes of models where the limiting graphs differ from Erdős–Rényi graphs. A limitation of our approach, inherited from the limitation of graph limit theory, is that it works only for dense graphs.
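As a rough numerical illustration of the degeneracy described above, consider the edge-triangle model with edge parameter β1 and a nonnegative triangle parameter β2. For this case the large-deviations approximation is understood to reduce to a one-dimensional problem: maximize β1·u + β2·u³ − I(u) over edge densities u in [0, 1], where I(u) = ½·u·log u + ½·(1 − u)·log(1 − u). This scalar form is recalled from the body of the paper rather than stated in the abstract, so treat it as an assumption. The function names and the grid search below are likewise illustrative choices, not the authors' procedure.

```python
import math


def entropy_cost(u):
    """I(u) = (1/2) u log u + (1/2) (1 - u) log(1 - u), with 0 log 0 taken as 0."""
    def xlogx(x):
        return 0.0 if x == 0.0 else x * math.log(x)
    return 0.5 * (xlogx(u) + xlogx(1.0 - u))


def maximizer(beta1, beta2, grid=10001):
    """Grid search for the edge density u in [0, 1] that maximizes
    beta1 * u + beta2 * u**3 - I(u).

    Assumed scalar form of the variational problem for the edge-triangle
    model with beta2 >= 0. Returns (best_u, best_value).
    """
    best_u, best_val = 0.0, -math.inf
    for k in range(grid):
        u = k / (grid - 1)
        val = beta1 * u + beta2 * u ** 3 - entropy_cost(u)
        if val > best_val:
            best_u, best_val = u, val
    return best_u, best_val


def scan_beta1(beta2, beta1_values):
    """Return the maximizing density for each beta1 at a fixed beta2."""
    return [(b1, maximizer(b1, beta2)[0]) for b1 in beta1_values]
```

With β2 = 0 the maximizer is the Erdős–Rényi density e^{2β1}/(1 + e^{2β1}), which gives a quick sanity check. Calling `scan_beta1` with a fixed, fairly large β2 and a fine range of β1 values is one way to look for an abrupt move of the maximizing density from near 0 to near 1, the numerical signature of the "essentially no edges or essentially complete" behavior.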

Article information

Source
Ann. Statist., Volume 41, Number 5 (2013), 2428–2461.

Dates
First available in Project Euclid: 5 November 2013

Permanent link to this document
https://projecteuclid.org/euclid.aos/1383661269

Digital Object Identifier
doi:10.1214/13-AOS1155

Mathematical Reviews number (MathSciNet)
MR3127871

Zentralblatt MATH identifier
1293.62046

Subjects
Primary: 62F10: Point estimation; 05C80: Random graphs [See also 60B20]
Secondary: 62P25: Applications to social sciences; 60F10: Large deviations

Keywords
Random graph; Erdős–Rényi; graph limit; exponential random graph models; parameter estimation

Citation

Chatterjee, Sourav; Diaconis, Persi. Estimating and understanding exponential random graph models. Ann. Statist. 41 (2013), no. 5, 2428–2461. doi:10.1214/13-AOS1155. https://projecteuclid.org/euclid.aos/1383661269


References

  • [1] Aldous, D. J. (1981). Representations for partially exchangeable arrays of random variables. J. Multivariate Anal. 11 581–598.
  • [2] Aristoff, D. and Radin, C. (2011). Emergent structures in large networks. Preprint. Available at http://arxiv.org/abs/1110.1912.
  • [3] Austin, T. (2008). On exchangeable random variables and the statistics of large graphs and hypergraphs. Probab. Surv. 5 80–145.
  • [4] Austin, T. and Tao, T. (2010). Testability and repair of hereditary hypergraph properties. Random Structures Algorithms 36 373–463.
  • [5] Besag, J. (1975). Statistical analysis of non-lattice data. Statistician 24 179–195.
  • [6] Bhamidi, S., Bresler, G. and Sly, A. (2008). Mixing time of exponential random graphs. In 2008 IEEE 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS) 803–812. IEEE, Washington, DC.
  • [7] Bollobás, B. (2001). Random Graphs, 2nd ed. Cambridge Studies in Advanced Mathematics 73. Cambridge Univ. Press, Cambridge.
  • [8] Bollobás, B. and Riordan, O. (2009). Metrics for sparse graphs. In Surveys in Combinatorics 2009. London Mathematical Society Lecture Note Series 365 211–287. Cambridge Univ. Press, Cambridge.
  • [9] Borgs, C., Chayes, J., Lovász, L., Sós, V. T. and Vesztergombi, K. (2006). Counting graph homomorphisms. In Topics in Discrete Mathematics. Algorithms and Combinatorics 26 315–371. Springer, Berlin.
  • [10] Borgs, C., Chayes, J. T., Lovász, L., Sós, V. T. and Vesztergombi, K. (2008). Convergent sequences of dense graphs. I. Subgraph frequencies, metric properties and testing. Adv. Math. 219 1801–1851.
  • [11] Borgs, C., Chayes, J. T., Lovász, L., Sós, V. T. and Vesztergombi, K. (2012). Convergent sequences of dense graphs II. Multiway cuts and statistical physics. Ann. of Math. (2) 176 151–219.
  • [12] Chatterjee, S. (2007). Estimation in spin glasses: A first step. Ann. Statist. 35 1931–1946.
  • [13] Chatterjee, S. and Dey, P. S. (2010). Applications of Stein’s method for concentration inequalities. Ann. Probab. 38 2443–2485.
  • [14] Chatterjee, S., Diaconis, P. and Sly, A. (2011). Random graphs with a given degree sequence. Ann. Appl. Probab. 21 1400–1435.
  • [15] Chatterjee, S. and Varadhan, S. R. S. (2011). The large deviation principle for the Erdős–Rényi random graph. European J. Combin. 32 1000–1017.
  • [16] Comets, F. and Janžura, M. (1998). A central limit theorem for conditionally centred random fields with an application to Markov fields. J. Appl. Probab. 35 608–621.
  • [17] Corander, J., Dahmström, K. and Dahmström, P. (2002). Maximum likelihood estimation for exponential random graph models. In Contributions to Social Network Analysis, Information Theory and Other Topics in Statistics: A Festschrift in Honour of Ove Frank (J. Hagberg, ed.) 1–17. Dept. Statistics, Univ. Stockholm.
  • [18] Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. (7) 28 33–61.
  • [19] Erdős, P. and Rényi, A. (1960). On the evolution of random graphs. Magyar Tud. Akad. Mat. Kutató Int. Közl. 5 17–61.
  • [20] Erdős, P. and Stone, A. H. (1946). On the structure of linear graphs. Bull. Amer. Math. Soc. (N.S.) 52 1087–1091.
  • [21] Fienberg, S. E. (2010). Introduction to papers on the modeling and analysis of network data. Ann. Appl. Stat. 4 1–4.
  • [22] Fienberg, S. E. (2010). Introduction to papers on the modeling and analysis of network data—II. Ann. Appl. Stat. 4 533–534.
  • [23] Fortuin, C. M., Kasteleyn, P. W. and Ginibre, J. (1971). Correlation inequalities on some partially ordered sets. Comm. Math. Phys. 22 89–103.
  • [24] Frank, O. and Strauss, D. (1986). Markov graphs. J. Amer. Statist. Assoc. 81 832–842.
  • [25] Freedman, M., Lovász, L. and Schrijver, A. (2007). Reflection positivity, rank connectivity, and homomorphism of graphs. J. Amer. Math. Soc. 20 37–51 (electronic).
  • [26] Frieze, A. and Kannan, R. (1999). Quick approximation to matrices and applications. Combinatorica 19 175–220.
  • [27] Gelfand, I. M. and Fomin, S. V. (2000). Calculus of Variations. Dover, New York.
  • [28] Gelman, A. and Meng, X.-L. (1998). Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Statist. Sci. 13 163–185.
  • [29] Geyer, C. J. and Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. J. R. Stat. Soc. Ser. B Stat. Methodol. 54 657–699.
  • [30] Häggström, O. and Jonasson, J. (1999). Phase transition in the random triangle model. J. Appl. Probab. 36 1101–1115.
  • [31] Handcock, M. S. (2003). Assessing degeneracy in statistical models of social networks. Working Paper 39, Center for Statistics and the Social Sciences, Univ. Washington, Seattle, WA.
  • [32] Holland, P. W. and Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs. J. Amer. Statist. Assoc. 76 33–65.
  • [33] Hoover, D. N. (1982). Row-column exchangeability and a generalized model for probability. In Exchangeability in Probability and Statistics (Rome, 1981) 281–291. North-Holland, Amsterdam.
  • [34] Janson, S., Łuczak, T. and Ruciński, A. (2000). Random Graphs. Wiley, New York.
  • [35] Kallenberg, O. (2005). Probabilistic Symmetries and Invariance Principles. Springer, New York.
  • [36] Kou, S. C., Zhou, Q. and Wong, W. H. (2006). Equi-energy sampler with applications in statistical inference and statistical mechanics. Ann. Statist. 34 1581–1652.
  • [37] Lovász, L. (2006). The rank of connection matrices and the dimension of graph algebras. European J. Combin. 27 962–970.
  • [38] Lovász, L. (2007). Connection matrices. In Combinatorics, Complexity, and Chance. Oxford Lecture Series in Mathematics and its Applications 34 179–190. Oxford Univ. Press, Oxford.
  • [39] Lovász, L. and Sós, V. T. (2008). Generalized quasirandom graphs. J. Combin. Theory Ser. B 98 146–163.
  • [40] Lovász, L. and Szegedy, B. (2006). Limits of dense graph sequences. J. Combin. Theory Ser. B 96 933–957.
  • [41] Lovász, L. and Szegedy, B. (2007). Szemerédi’s lemma for the analyst. Geom. Funct. Anal. 17 252–270.
  • [42] Lovász, L. and Szegedy, B. (2009). Contractors and connectors of graph algebras. J. Graph Theory 60 11–30.
  • [43] Lovász, L. and Szegedy, B. (2010). Testing properties of graphs and functions. Israel J. Math. 178 113–156.
  • [44] Lubetzky, E. and Zhao, Y. (2012). On replica symmetry of large deviations in random graphs. Preprint. Available at http://arxiv.org/abs/1210.7013.
  • [45] Park, J. and Newman, M. E. J. (2004). Solution of the two-star model of a network. Phys. Rev. E (3) 70 066146, 5.
  • [46] Park, J. and Newman, M. E. J. (2005). Solution for the properties of a clustered network. Phys. Rev. E (3) 72 026136, 5.
  • [47] Radin, C. and Sadun, L. (2013). Phase transitions in a complex network. J. Phys. A 46 305002.
  • [48] Radin, C. and Yin, M. (2011). Phase transitions in exponential random graphs. Preprint. Available at http://arxiv.org/abs/1108.0649.
  • [49] Rinaldo, A., Fienberg, S. E. and Zhou, Y. (2009). On the geometry of discrete exponential families with application to exponential random graph models. Electron. J. Stat. 3 446–484.
  • [50] Robbins, H. and Monro, S. (1951). A stochastic approximation method. Ann. Math. Statist. 22 400–407.
  • [51] Sanov, I. N. (1961). On the probability of large deviations of random variables. In Select. Transl. Math. Statist. and Probability, Vol. 1 213–244. Amer. Math. Soc., Providence, RI.
  • [52] Snijders, T. A. B. (2002). Markov chain Monte Carlo estimation of exponential random graph models. J. Soc. Structure 3.
  • [53] Snijders, T. A. B., Pattison, P. E., Robins, G. L. and Handcock, M. S. (2006). New specifications for exponential random graph models. Sociol. Method. 36 99–153.
  • [54] Strauss, D. (1986). On a general class of models for interaction. SIAM Rev. 28 513–527.
  • [55] Talagrand, M. (2003). Spin Glasses: A Challenge for Mathematicians. Ergebnisse der Mathematik und Ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics] 46. Springer, Berlin.
  • [56] Turán, P. (1941). Eine Extremalaufgabe aus der Graphentheorie. Mat. Fiz. Lapok 48 436–452.
  • [57] Wasserman, S. and Faust, K. (2010). Social Network Analysis: Methods and Applications. Structural Analysis in the Social Sciences, 2nd ed. Cambridge Univ. Press, Cambridge.
  • [58] Yin, M. (2012). Critical phenomena in exponential random graphs. Preprint. Available at http://arxiv.org/abs/1208.2992.