Model Selection for Social Networks Using Graphlets

Jeannette Janssen; Matt Hurshman; Nauzer Kalyaniwalla

doi:im/1354809988

Internet Math. 8(4): 338-363 (2012).

Abstract

Several network models have been proposed to explain the link structure observed in online social networks. This paper addresses the problem of choosing the model that best fits a given real-world network. We implement a model-selection method based on supervised learning: an alternating decision tree is trained on synthetic graphs generated according to each of the models under consideration. We use a broad array of features, with the aim of representing different structural aspects of the network. These include the frequency counts of small subgraphs (graphlets) as well as features capturing the degree distribution and the small-world property. Our method correctly classifies synthetic graphs and is robust under perturbations of the graphs. We show that the graphlet counts alone are sufficient to separate the training data, indicating that graphlet counts are a good way of capturing network structure. We tested our approach on four Facebook graphs from various American universities. The models that best fit these data are those based on the principle of preferential attachment.
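
To make the idea of graphlet-count features concrete, here is a minimal sketch, not the authors' implementation. It counts only the two connected 3-node graphlets (open path and triangle) in an undirected graph and builds one labelled training example from a synthetic preferential-attachment graph. The function names, the generator details, the label "PA", and the restriction to 3-node graphlets are all assumptions made for illustration. The paper's actual graphlet set, models, and alternating decision tree are not reproduced here.

```python
import random
from itertools import combinations


def count_3node_graphlets(edges):
    """Count induced connected 3-node subgraphs: open paths and triangles."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    open_paths = 0
    triangle_corners = 0  # every triangle is seen once from each of its 3 nodes
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                triangle_corners += 1
            else:
                open_paths += 1  # induced path a - centre - b
    return {"path": open_paths, "triangle": triangle_corners // 3}


def preferential_attachment_edges(n, m, seed=0):
    """Hypothetical generator: each new node links to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a clique on m + 1 nodes so every node has positive degree.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in this list once per unit of degree.
    endpoints = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


# One labelled training example: (feature vector, name of generating model).
synthetic = preferential_attachment_edges(n=50, m=2, seed=1)
example = (count_3node_graphlets(synthetic), "PA")
```

Repeating the last step over many synthetic graphs from each candidate model would yield a labelled training set for a classifier. The features of a real network could then be computed with the same function and passed to the trained classifier.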