## Annals of Statistics

### Community detection in degree-corrected block models

#### Abstract

Community detection is a central problem of network data analysis. Given a network, the goal of community detection is to partition the network nodes into a small number of clusters, which can often help reveal interesting structures. The present paper studies community detection in Degree-Corrected Block Models (DCBMs). We first derive asymptotic minimax risks of the problem for a misclassification proportion loss under appropriate conditions. The minimax risks are shown to depend on degree-correction parameters, community sizes and average within- and between-community connectivities in an intuitive and interpretable way. In addition, we propose a polynomial-time algorithm to adaptively perform consistent and even asymptotically optimal community detection in DCBMs.

#### Article information

Source
Ann. Statist., Volume 46, Number 5 (2018), 2153-2185.

Dates
Revised: July 2017
First available in Project Euclid: 17 August 2018

Permanent link to this document
https://projecteuclid.org/euclid.aos/1534492832

Digital Object Identifier
doi:10.1214/17-AOS1615

Mathematical Reviews number (MathSciNet)
MR3845014

Zentralblatt MATH identifier
06964329

#### Citation

Gao, Chao; Ma, Zongming; Zhang, Anderson Y.; Zhou, Harrison H. Community detection in degree-corrected block models. Ann. Statist. 46 (2018), no. 5, 2153–2185. doi:10.1214/17-AOS1615. https://projecteuclid.org/euclid.aos/1534492832

#### References

• [1] Abbe, E., Bandeira, A. S. and Hall, G. (2016). Exact recovery in the stochastic block model. IEEE Trans. Inform. Theory 62 471–487. https://doi.org/10.1109/TIT.2015.2490670.
• [2] Abbe, E. and Sandon, C. (2015). Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science—FOCS 2015 670–688. IEEE Computer Soc., Los Alamitos, CA.
• [3] Adamic, L. A. and Glance, N. (2005). The political blogosphere and the 2004 U.S. election: Divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery 36–43. ACM, New York.
• [4] Amini, A. A., Chen, A., Bickel, P. J. and Levina, E. (2013). Pseudo-likelihood methods for community detection in large sparse networks. Ann. Statist. 41 2097–2122. https://doi.org/10.1214/13-AOS1138.
• [5] Bickel, P. J. and Chen, A. (2009). A nonparametric view of network models and Newman–Girvan and other modularities. Proc. Natl. Acad. Sci. USA 106 21068–21073.
• [6] Chen, Y., Li, X. and Xu, J. (2015). Convexified modularity maximization for degree-corrected stochastic block models. Preprint. Available at arXiv:1512.08425.
• [7] Chin, P., Rao, A. and Vu, V. (2015). Stochastic block model and community detection in sparse graphs: A spectral algorithm with optimal rate of recovery. In Proceedings of the 28th Conference on Learning Theory 391–423.
• [8] Dasgupta, A., Hopcroft, J. E. and McSherry, F. (2004). Spectral analysis of random graphs with skewed degree distributions. In Foundations of Computer Science, 2004. Proceedings. 45th Annual IEEE Symposium on 602–610. IEEE, New York.
• [9] Decelle, A., Krzakala, F., Moore, C. and Zdeborová, L. (2011). Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 84 066106.
• [10] Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2015). Achieving optimal misclassification proportion in stochastic block model. Preprint. Available at arXiv:1505.03772.
• [11] Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2018). Supplement to “Community detection in degree-corrected block models.” DOI:10.1214/17-AOS1615SUPP.
• [12] Gulikers, L., Lelarge, M. and Massoulié, L. (2015). An impossibility result for reconstruction in a degree-corrected planted-partition model. Preprint. Available at arXiv:1511.00546.
• [13] Gulikers, L., Lelarge, M. and Massoulié, L. (2015). A spectral method for community detection in moderately-sparse degree-corrected stochastic block models. Preprint. Available at arXiv:1506.08621.
• [14] Hajek, B., Wu, Y. and Xu, J. (2014). Achieving exact cluster recovery threshold via semidefinite programming. Preprint. Available at arXiv:1412.6156.
• [15] Hajek, B., Wu, Y. and Xu, J. (2015). Achieving exact cluster recovery threshold via semidefinite programming: Extensions. Preprint. Available at arXiv:1502.07738.
• [16] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Soc. Netw. 5 109–137. https://doi.org/10.1016/0378-8733(83)90021-7.
• [17] Jin, J. (2015). Fast community detection by SCORE. Ann. Statist. 43 57–89.
• [18] Karrer, B. and Newman, M. E. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E 83 016107.
• [19] Lei, J. and Rinaldo, A. (2015). Consistency of spectral clustering in stochastic block models. Ann. Statist. 43 215–237.
• [20] Massoulié, L. (2014). Community detection thresholds and the weak Ramanujan property. In STOC’14—Proceedings of the 2014 ACM Symposium on Theory of Computing 694–703. ACM, New York.
• [21] Mossel, E., Neeman, J. and Sly, A. (2012). Stochastic block models and reconstruction. Preprint. Available at arXiv:1202.1499.
• [22] Mossel, E., Neeman, J. and Sly, A. (2013). A proof of the block model threshold conjecture. Preprint. Available at arXiv:1311.4115.
• [23] Mossel, E., Neeman, J. and Sly, A. (2014). Consistency thresholds for binary symmetric block models. Preprint. Available at arXiv:1407.1591.
• [24] Peixoto, T. P. (2015). Model selection and hypothesis testing for large-scale network models with overlapping groups. Phys. Rev. X 5 011033.
• [25] Qin, T. and Rohe, K. (2013). Regularized spectral clustering under the degree-corrected stochastic blockmodel. In Advances in Neural Information Processing Systems 3120–3128.
• [26] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878–1915.
• [27] Zhang, A. Y. and Zhou, H. H. (2016). Minimax rates of community detection in stochastic block models. Ann. Statist. 44 2252–2280.
• [28] Zhao, Y., Levina, E. and Zhu, J. (2012). Consistency of community detection in networks under degree-corrected stochastic block models. Ann. Statist. 40 2266–2292.

#### Supplemental materials

• Supplement to “Community detection in degree-corrected block models.” The supplement [11] presents additional numerical results, additional proofs of main results, properties of $J_{t}(p,q)$ and proofs of auxiliary results.