The Annals of Statistics
- Ann. Statist.
- Volume 45, Number 2 (2017), 500-528.
Likelihood-based model selection for stochastic block models
Y. X. Rachel Wang and Peter J. Bickel
Abstract
The stochastic block model (SBM) provides a popular framework for modeling community structures in networks. However, more attention has been devoted to estimating the latent node labels and the model parameters than to choosing the number of blocks. We consider an approach based on the log likelihood ratio statistic and analyze its asymptotic properties under model misspecification. We show that the limiting distribution of the statistic is normal in the case of underfitting and obtain its convergence rate in the case of overfitting. These conclusions remain valid when the average degree grows at a polylog rate. The results enable us to derive the correct order of the penalty term for model complexity and arrive at a likelihood-based model selection criterion that is asymptotically consistent. Our analysis can also be extended to a degree-corrected block model (DCSBM). In practice, the likelihood function can be estimated using more computationally efficient variational methods or consistent label estimation algorithms, allowing the criterion to be applied to large networks.
Article information
Source
Ann. Statist. Volume 45, Number 2 (2017), 500-528.
Dates
Received: October 2015
Revised: February 2016
First available in Project Euclid: 16 May 2017
Permanent link to this document
http://projecteuclid.org/euclid.aos/1494921948
Digital Object Identifier
doi:10.1214/16-AOS1457
Subjects
Primary: 62F05: Asymptotic properties of tests
Keywords
Stochastic block models; model misspecification; network communities; likelihood ratio statistic
Citation
Wang, Y. X. Rachel; Bickel, Peter J. Likelihood-based model selection for stochastic block models. Ann. Statist. 45 (2017), no. 2, 500-528. doi:10.1214/16-AOS1457. http://projecteuclid.org/euclid.aos/1494921948.
Supplemental materials
- Supplement to “Likelihood-based model selection for stochastic block models”. A proof sketch of how the main results in the paper can be extended to the DCSBM described in Section 2.5 is provided in the supplement. Digital Object Identifier: doi:10.1214/16-AOS1457SUPP. Supplemental files available for subscribers.

