## Statistical Science

### The Geometry of Continuous Latent Space Models for Network Data

#### Abstract

We review the class of continuous latent space (statistical) models for network data, paying particular attention to the role of the geometry of the latent space. In these models, the presence or absence of each dyadic tie is assumed to be conditionally independent of the others given the dyads’ unobserved positions in a latent space. In this way, these models provide a probabilistic framework for embedding network nodes in a continuous space equipped with a geometry that facilitates the description of dependence between random dyadic ties. Specifically, these models naturally capture homophilous tendencies and triadic clustering, among other common properties of observed networks. In addition to reviewing the literature on continuous latent space models from a geometric perspective, we use intuition and simulation to highlight the important role that the geometry of the latent space plays in the properties of networks arising from these models. Finally, we discuss results from spectral graph theory that allow us to explore the role of the geometry of the latent space, independent of network size. We conclude with conjectures about how these results might be used to infer the appropriate latent space geometry from observed networks.
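The conditional independence assumption described in the abstract can be illustrated with a minimal simulation sketch. The specific choices here are assumptions made for illustration and are not taken from the article: a Euclidean latent space, standard normal latent positions, a logistic link of the form logit P(tie) = alpha - distance, and the hypothetical function name `simulate_latent_distance_network`.

```python
import math
import random


def simulate_latent_distance_network(n, dim=2, alpha=1.0, seed=0):
    """Simulate an undirected network from a Euclidean latent distance model.

    Assumed illustrative form (not the article's specification):
        logit P(y_ij = 1 | z_i, z_j) = alpha - ||z_i - z_j||
    """
    rng = random.Random(seed)
    # Latent positions: i.i.d. standard normal draws in `dim` dimensions.
    z = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    y = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(z[i], z[j])
            p = 1.0 / (1.0 + math.exp(-(alpha - dist)))
            # Given the positions, each tie is an independent Bernoulli draw.
            tie = 1 if rng.random() < p else 0
            y[i][j] = y[j][i] = tie
    return z, y


z, y = simulate_latent_distance_network(n=30)
```

Because the tie probability decreases with latent distance, nodes that are close to one another tend to be connected, which is one way such a model can produce homophily and triadic clustering. Swapping `math.dist` for a distance function from a different geometry would change the structure of the simulated networks.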

#### Article information

Source
Statist. Sci., Volume 34, Number 3 (2019), 428–453.

Dates
First available in Project Euclid: 11 October 2019

Permanent link to this document
https://projecteuclid.org/euclid.ss/1570780978

Digital Object Identifier
doi:10.1214/19-STS702

Mathematical Reviews number (MathSciNet)
MR4017522

Zentralblatt MATH identifier
07162131

#### Citation

Smith, Anna L.; Asta, Dena M.; Calder, Catherine A. The Geometry of Continuous Latent Space Models for Network Data. Statist. Sci. 34 (2019), no. 3, 428–453. doi:10.1214/19-STS702. https://projecteuclid.org/euclid.ss/1570780978
