The Annals of Statistics

Community detection in degree-corrected block models

Chao Gao, Zongming Ma, Anderson Y. Zhang, and Harrison H. Zhou

Abstract

Community detection is a central problem of network data analysis. Given a network, the goal of community detection is to partition the network nodes into a small number of clusters, which could often help reveal interesting structures. The present paper studies community detection in Degree-Corrected Block Models (DCBMs). We first derive asymptotic minimax risks of the problem for a misclassification proportion loss under appropriate conditions. The minimax risks are shown to depend on degree-correction parameters, community sizes and average within and between community connectivities in an intuitive and interpretable way. In addition, we propose a polynomial time algorithm to adaptively perform consistent and even asymptotically optimal community detection in DCBMs.
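To make the setting concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes a simplified two-parameter DCBM in which nodes i and j are connected with probability theta[i] * theta[j] * p inside a community and theta[i] * theta[j] * q across communities, and it assumes the common definition of the misclassification proportion as the fraction of mislabeled nodes, minimized over relabelings of the communities. The function names `sample_dcbm` and `misclassification_proportion` and all parameter values are hypothetical, chosen only for illustration.

```python
import itertools
import random


def sample_dcbm(labels, theta, p, q, seed=0):
    """Sample a symmetric 0/1 adjacency matrix with no self-loops.

    Assumed simplified model: edge (i, j) appears with probability
    theta[i] * theta[j] * p if labels[i] == labels[j], and
    theta[i] * theta[j] * q otherwise (clipped to 1).
    """
    rng = random.Random(seed)
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            base = p if labels[i] == labels[j] else q
            prob = min(1.0, theta[i] * theta[j] * base)
            if rng.random() < prob:
                adj[i][j] = adj[j][i] = 1
    return adj


def misclassification_proportion(truth, estimate, k):
    """Fraction of mislabeled nodes, minimized over relabelings of k communities."""
    n = len(truth)
    best = n
    # Community labels are only identifiable up to permutation, so try them all.
    for perm in itertools.permutations(range(k)):
        errors = sum(1 for t, e in zip(truth, estimate) if t != perm[e])
        best = min(best, errors)
    return best / n


# Illustrative values only: two communities of three nodes each.
truth = [0, 0, 0, 1, 1, 1]
theta = [1.2, 1.0, 0.8, 1.2, 1.0, 0.8]  # hypothetical degree-correction parameters
adj = sample_dcbm(truth, theta, p=0.6, q=0.1, seed=1)

# An estimate with swapped label names and one genuinely misplaced node.
estimate = [1, 1, 1, 0, 0, 1]
loss = misclassification_proportion(truth, estimate, k=2)  # 1 of 6 nodes is wrong
```

Enumerating all permutations is practical only for a small number of communities; it is used here because it keeps the definition of the loss explicit.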

Article information

Source
Ann. Statist., Volume 46, Number 5 (2018), 2153-2185.

Dates
Received: July 2016
Revised: July 2017
First available in Project Euclid: 17 August 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1534492832

Digital Object Identifier
doi:10.1214/17-AOS1615

Mathematical Reviews number (MathSciNet)
MR3845014

Zentralblatt MATH identifier
06964329

Subjects
Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]; 91D30: Social networks
Secondary: 62C20: Minimax procedures; 90B15: Network models, stochastic

Keywords
Clustering; minimax rates; network analysis; spectral clustering; stochastic block model

Citation

Gao, Chao; Ma, Zongming; Zhang, Anderson Y.; Zhou, Harrison H. Community detection in degree-corrected block models. Ann. Statist. 46 (2018), no. 5, 2153–2185. doi:10.1214/17-AOS1615. https://projecteuclid.org/euclid.aos/1534492832


References

  • [1] Abbe, E., Bandeira, A. S. and Hall, G. (2016). Exact recovery in the stochastic block model. IEEE Trans. Inform. Theory 62 471–487. https://doi.org/10.1109/TIT.2015.2490670.
  • [2] Abbe, E. and Sandon, C. (2015). Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science—FOCS 2015 670–688. IEEE Computer Soc., Los Alamitos, CA.
  • [3] Adamic, L. A. and Glance, N. (2005). The political blogosphere and the 2004 U.S. election: Divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery 36–43. ACM, New York.
  • [4] Amini, A. A., Chen, A., Bickel, P. J. and Levina, E. (2013). Pseudo-likelihood methods for community detection in large sparse networks. Ann. Statist. 41 2097–2122. https://doi.org/10.1214/13-AOS1138.
  • [5] Bickel, P. J. and Chen, A. (2009). A nonparametric view of network models and Newman–Girvan and other modularities. Proc. Natl. Acad. Sci. USA 106 21068–21073.
  • [6] Chen, Y., Li, X. and Xu, J. (2015). Convexified modularity maximization for degree-corrected stochastic block models. Preprint. Available at arXiv:1512.08425.
  • [7] Chin, P., Rao, A. and Vu, V. (2015). Stochastic block model and community detection in sparse graphs: A spectral algorithm with optimal rate of recovery. In Proceedings of the 28th Conference on Learning Theory 391–423.
  • [8] Dasgupta, A., Hopcroft, J. E. and McSherry, F. (2004). Spectral analysis of random graphs with skewed degree distributions. In Foundations of Computer Science, 2004. Proceedings. 45th Annual IEEE Symposium on 602–610. IEEE, New York.
  • [9] Decelle, A., Krzakala, F., Moore, C. and Zdeborová, L. (2011). Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 84 066106.
  • [10] Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2015). Achieving optimal misclassification proportion in stochastic block model. Preprint. Available at arXiv:1505.03772.
  • [11] Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2018). Supplement to “Community detection in degree-corrected block models.” DOI:10.1214/17-AOS1615SUPP.
  • [12] Gulikers, L., Lelarge, M. and Massoulié, L. (2015). An impossibility result for reconstruction in a degree-corrected planted-partition model. Preprint. Available at arXiv:1511.00546.
  • [13] Gulikers, L., Lelarge, M. and Massoulié, L. (2015). A spectral method for community detection in moderately-sparse degree-corrected stochastic block models. Preprint. Available at arXiv:1506.08621.
  • [14] Hajek, B., Wu, Y. and Xu, J. (2014). Achieving exact cluster recovery threshold via semidefinite programming. Preprint. Available at arXiv:1412.6156.
  • [15] Hajek, B., Wu, Y. and Xu, J. (2015). Achieving exact cluster recovery threshold via semidefinite programming: Extensions. Preprint. Available at arXiv:1502.07738.
  • [16] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Soc. Netw. 5 109–137. https://doi.org/10.1016/0378-8733(83)90021-7.
  • [17] Jin, J. (2015). Fast community detection by SCORE. Ann. Statist. 43 57–89.
  • [18] Karrer, B. and Newman, M. E. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E 83 016107.
  • [19] Lei, J. and Rinaldo, A. (2015). Consistency of spectral clustering in stochastic block models. Ann. Statist. 43 215–237.
  • [20] Massoulié, L. (2014). Community detection thresholds and the weak Ramanujan property. In STOC’14—Proceedings of the 2014 ACM Symposium on Theory of Computing 694–703. ACM, New York.
  • [21] Mossel, E., Neeman, J. and Sly, A. (2012). Stochastic block models and reconstruction. Preprint. Available at arXiv:1202.1499.
  • [22] Mossel, E., Neeman, J. and Sly, A. (2013). A proof of the block model threshold conjecture. Preprint. Available at arXiv:1311.4115.
  • [23] Mossel, E., Neeman, J. and Sly, A. (2014). Consistency thresholds for binary symmetric block models. Preprint. Available at arXiv:1407.1591.
  • [24] Peixoto, T. P. (2015). Model selection and hypothesis testing for large-scale network models with overlapping groups. Phys. Rev. X 5 011033.
  • [25] Qin, T. and Rohe, K. (2013). Regularized spectral clustering under the degree-corrected stochastic blockmodel. In Advances in Neural Information Processing Systems 3120–3128.
  • [26] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878–1915.
  • [27] Zhang, A. Y. and Zhou, H. H. (2016). Minimax rates of community detection in stochastic block models. Ann. Statist. 44 2252–2280.
  • [28] Zhao, Y., Levina, E. and Zhu, J. (2012). Consistency of community detection in networks under degree-corrected stochastic block models. Ann. Statist. 40 2266–2292.

Supplemental materials

  • Supplement to “Community detection in degree-corrected block models.” The supplement [11] presents additional numerical results, additional proofs of the main results, properties of $J_{t}(p,q)$, and proofs of auxiliary results.