Electronic Journal of Statistics

Empirical Bayes estimation for the stochastic blockmodel

Shakira Suwan, Dominic S. Lee, Runze Tang, Daniel L. Sussman, Minh Tang, and Carey E. Priebe

Full-text: Open access

Abstract

Inference for the stochastic blockmodel is currently of burgeoning interest in the statistical community, as well as in various application domains as diverse as social networks, citation networks, brain connectivity networks (connectomics), etc. Recent theoretical developments have shown that spectral embedding of graphs yields tractable distributional results; in particular, a random dot product latent position graph formulation of the stochastic blockmodel informs a mixture of normal distributions for the adjacency spectral embedding. We employ this new theory to provide an empirical Bayes methodology for estimation of block memberships of vertices in a random graph drawn from the stochastic blockmodel, and demonstrate its practical utility. The posterior inference is conducted using a Metropolis-within-Gibbs algorithm. The theory and methods are illustrated through Monte Carlo simulation studies, both within the stochastic blockmodel and beyond, and experimental results on a Wikipedia graph are presented.

Article information

Source
Electron. J. Statist., Volume 10, Number 1 (2016), 761-782.

Dates
Received: April 2015
First available in Project Euclid: 22 March 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1458655726

Digital Object Identifier
doi:10.1214/16-EJS1115

Mathematical Reviews number (MathSciNet)
MR3477741

Zentralblatt MATH identifier
1333.62088

Subjects
Primary: 62F15: Bayesian inference
Secondary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

Keywords
Adjacency spectral graph embedding Bayesian inference random dot product graph model stochastic blockmodel

Citation

Suwan, Shakira; Lee, Dominic S.; Tang, Runze; Sussman, Daniel L.; Tang, Minh; Priebe, Carey E. Empirical Bayes estimation for the stochastic blockmodel. Electron. J. Statist. 10 (2016), no. 1, 761--782. doi:10.1214/16-EJS1115. https://projecteuclid.org/euclid.ejs/1458655726


Export citation

References

  • Airoldi, E., D. Blei, S. Fienberg, and E. Xing (2008). Mixed membership stochastic blockmodels., Journal of Machine Learning Research. 9, 1981–2014.
  • Airoldi, E. M., T. B. Costa, and S. H. Chan (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger (Eds.), Advances in Neural Information Processing Systems 26, pp. 692–700.
  • Athreya, A., C. Priebe, M. Tang, V. Lyzinski, D. Marchette, and D. Sussman (2015). A limit theorem for scaled eigenvectors of random dot product graphs., Sankhya A.
  • Bickel, P. and A. Chen (2009). A nonparametric view of network models and Newman–Girvan and other modularities., Proceedings of the National Academy of Sciences 106(50), 21068–21073.
  • Bickel, P., D. Choi, X. Chang, and H. Zhang (2013). Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels., The Annals of Statistics 41(4), 1922–1943.
  • Celisse, A., J.-J. Daudin, and L. Pierre (2012). Consistency of maximum-likelihood and variational estimators in the stochastic block model., Electronic Journal of Statistics 6, 1847–1899.
  • Choi, D. S., P. J. Wolfe, and E. M. Airoldi (2012). Stochastic blockmodels with a growing number of classes., Biometrika 99(2), 273–284.
  • Fienberg, S. E. (2010). Introduction to papers on the modeling and analysis of network data., Ann. Appl. Statist 4, 1–4.
  • Fienberg, S. E. (2012). A brief history of statistical models for network analysis and open challenges., Journal of Computational and Graphical Statistics 21(4), 825–839.
  • Fishkind, D. E., D. L. Sussman, M. Tang, J. T. Vogelstein, and C. E. Priebe (2013). Consistent adjacency-spectral partitioning for the stochastic block model when the model parameters are unknown., SIAM Journal on Matrix Analysis and Applications 34(1), 23–39.
  • Fraley, C. and A. E. Raftery (2002). Model-based clustering, discriminant analysis, and density estimation., Journal of the American Statistical Association 97, 611–631.
  • Goldenberg, A., A. X. Zheng, S. E. Fienberg, and E. M. Airoldi (2010). A survey of statistical network models., Foundations and Trends in Machine Learning 2(2), 129–233.
  • Handcock, M., A. Raftery, and J. Tantrum (2007). Model-based clustering for social networks., Journal of the Royal Statistical Society. Series A (Statistics in Society) 170(2), pp. 301–354.
  • Hoff, P., A. Raftery, and M. Handcock (2002). Latent space approaches to social network analysis., Journal of the american Statistical association 97(460), 1090–1098.
  • Holland, P. W., K. B. Laskey, and S. Leinhardt (1983). Stochastic blockmodels: First steps., Social Networks 5(2), 109–137.
  • Lyzinski, V., D. Sussman, M. Tang, A. Athreya, and C. Priebe (2014). Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding., Electronic Journal of Statistics.
  • Newman, M. E. (2006). Modularity and community structure in networks., Proceedings of the National Academy of Sciences 103(23), 8577–8582.
  • Nickel, C. (2006)., Random dot product graphs: A model for social networks. Ph. D. thesis, Johns Hopkins University.
  • Nowicki, K. and T. Snijders (2001). Estimation and prediction for stochastic blockstructures., Journal of the American Statistical Association 96(455), pp. 1077–1087.
  • Olhede, S. C. and P. J. Wolfe (2013). Network histograms and universality of blockmodel approximation., arXiv preprint arXiv:1312.5306.
  • Rohe, K., S. Chatterjee, and B. Yu (2011). Spectral clustering and the high-dimensional stochastic blockmodel., The Annals of Statistics 39(4), 1878–1915.
  • Snijders, T. and K. Nowicki (1997). Estimation and prediction for stochastic blockmodels for graphs with latent block structure., Journal of Classification 14(1), 75–100.
  • Sussman, D. L. (2014)., Foundations of Adjacency Spectral Embedding. Ph. D. thesis, Johns Hopkins University.
  • Sussman, D. L., M. Tang, D. E. Fishkind, and C. E. Priebe (2012). A consistent adjacency spectral embedding for stochastic blockmodel graphs., Journal of the American Statistical Association 107(499), 1119–1128.
  • Sussman, D. L., M. Tang, and C. E. Priebe (2014). Consistent latent position estimation and vertex classification for random dot product graphs., Pattern Analysis and Machine Intelligence, IEEE Transactions on 36(1), 48–57.
  • Tang, M., D. L. Sussman, and C. E. Priebe (2013). Universally consistent vertex classification for latent positions graphs., The Annals of Statistics 41(3), 1406–1430.
  • Young, S. and E. Scheinerman (2007). Random dot product graph models for social networks., Algorithms and models for the web-graph, 138–149.