Model-based clustering of large networks

Duy Q. Vu; David R. Hunter; Michael Schweinberger

doi:10.1214/12-AOAS617

Ann. Appl. Stat. 7(2): 1010-1039 (June 2013). DOI: 10.1214/12-AOAS617

Abstract

We describe a network clustering framework, based on finite mixture models, that can be applied to discrete-valued networks with hundreds of thousands of nodes and billions of edge variables. Relative to other recent model-based clustering work for networks, we introduce a more flexible modeling framework, improve the variational-approximation estimation algorithm, discuss and implement standard error estimation via a parametric bootstrap approach, and apply these methods to much larger data sets than those seen elsewhere in the literature. The more flexible framework is achieved through introducing novel parameterizations of the model, giving varying degrees of parsimony, using exponential family models whose structure may be exploited in various theoretical and algorithmic ways. The algorithms are based on variational generalized EM algorithms, where the E-steps are augmented by a minorization-maximization (MM) idea. The bootstrapped standard error estimates are based on an efficient Monte Carlo network simulation idea. Last, we demonstrate the usefulness of the model-based clustering framework by applying it to a discrete-valued network with more than 131,000 nodes and 17 billion edge variables.
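The parametric bootstrap idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes a plain binary, directed stochastic block model, holds the block labels fixed, and re-estimates block edge probabilities by simple frequencies instead of running a variational generalized EM fit. All function names and the toy parameter values are hypothetical.

```python
import random
import statistics

def simulate_sbm(labels, probs, rng):
    """Draw a directed binary network; edge i->j is Bernoulli(probs[block_i][block_j])."""
    n = len(labels)
    return [[1 if i != j and rng.random() < probs[labels[i]][labels[j]] else 0
             for j in range(n)] for i in range(n)]

def estimate_probs(adj, labels, k):
    """Plug-in estimate: observed edge frequency in each ordered pair of blocks."""
    ones = [[0] * k for _ in range(k)]
    total = [[0] * k for _ in range(k)]
    n = len(labels)
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops
                ones[labels[i]][labels[j]] += adj[i][j]
                total[labels[i]][labels[j]] += 1
    return [[ones[a][b] / total[a][b] if total[a][b] else 0.0
             for b in range(k)] for a in range(k)]

def bootstrap_se(labels, fitted, k, reps, seed=0):
    """Simulate `reps` networks from the fitted model, re-estimate, and
    report the standard deviation of each estimate across replicates."""
    rng = random.Random(seed)
    draws = [estimate_probs(simulate_sbm(labels, fitted, rng), labels, k)
             for _ in range(reps)]
    return [[statistics.stdev([d[a][b] for d in draws]) for b in range(k)]
            for a in range(k)]

# Toy example (assumed values): 40 nodes in two equal blocks.
labels = [0] * 20 + [1] * 20
fitted = [[0.30, 0.05], [0.05, 0.30]]
se = bootstrap_se(labels, fitted, k=2, reps=50)
```

Holding the labels fixed keeps the sketch short; a fuller bootstrap would refit the entire clustering model on each simulated network.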

Citation

Duy Q. Vu, David R. Hunter, Michael Schweinberger. "Model-based clustering of large networks." Ann. Appl. Stat. 7 (2) 1010-1039, June 2013. https://doi.org/10.1214/12-AOAS617

Information

Published: June 2013

First available in Project Euclid: 27 June 2013

zbMATH: 1288.62106

MathSciNet: MR3113499

Digital Object Identifier: 10.1214/12-AOAS617

Keywords: EM algorithms, finite mixture models, generalized EM algorithms, MM algorithms, social networks, stochastic block models, variational EM algorithms
