Abstract
We study graphons as a nonparametric generalization of stochastic block models, and show how to obtain compactly represented estimators for sparse networks in this framework. In contrast to previous work, we relax the usual boundedness assumption for the generating graphon and instead assume only integrability, so that we can handle networks that have long tails in their degree distributions. We also relax the usual assumption that the graphon is defined on the unit interval, to allow latent position graphs based on more general spaces.
We analyze three algorithms. The first is a least squares algorithm, which gives a consistent estimator for all square-integrable graphons, with errors expressed in terms of the best possible stochastic block model approximation. Next, we analyze an algorithm based on the cut norm, which works for all integrable graphons. Finally, we show that clustering based on degrees works whenever the underlying degree distribution is atomless.
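As a rough illustration of the third approach, the sketch below sorts vertices by degree, splits them into k nearly equal blocks, and averages the adjacency entries over each pair of blocks to obtain a block model estimate. It is a minimal Python sketch under stated assumptions, not the paper's algorithm or its analysis: the function names `sample_graph` and `degree_block_estimate`, the equal-size split, and the sampling rule min(1, rho * W(x, y)) are illustrative choices.

```python
import random


def sample_graph(n, rho, graphon, rng):
    """Sample a symmetric 0/1 adjacency matrix on n vertices.

    Assumption: each vertex gets a latent position uniform on [0, 1], and
    an edge {i, j} appears independently with probability
    min(1, rho * graphon(x_i, x_j)), where rho is a target density.
    """
    xs = [rng.random() for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, rho * graphon(xs[i], xs[j]))
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj


def degree_block_estimate(adj, k):
    """Cluster vertices by degree and estimate block edge densities.

    Returns (labels, density): labels[v] is the block of vertex v, and
    density[a][b] is the fraction of vertex pairs between blocks a and b
    that are joined by an edge.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    # Sort vertices by degree (ties keep index order) and cut the sorted
    # list into k consecutive blocks of nearly equal size.
    order = sorted(range(n), key=lambda v: degrees[v])
    labels = [0] * n
    for rank, v in enumerate(order):
        labels[v] = rank * k // n

    # Count edges and vertex pairs for every pair of blocks.
    edges = [[0] * k for _ in range(k)]
    pairs = [[0] * k for _ in range(k)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b = labels[i], labels[j]
            edges[a][b] += adj[i][j]
            pairs[a][b] += 1
            if a != b:
                edges[b][a] += adj[i][j]
                pairs[b][a] += 1

    density = [
        [edges[a][b] / pairs[a][b] if pairs[a][b] else 0.0 for b in range(k)]
        for a in range(k)
    ]
    return labels, density
```

For instance, with the hypothetical graphon W(x, y) = 4xy, whose degree function 2x is strictly increasing (so its degree distribution is atomless), one could call `sample_graph(500, 0.1, lambda x, y: 4 * x * y, random.Random(0))` and pass the result to `degree_block_estimate` with a small k. Dividing the estimated densities by the observed edge density would then give a rescaled block model to compare against W.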
Funding Statement
Shirshendu Ganguly was supported by an internship at Microsoft Research New England.
Acknowledgments
We thank David Choi, Sofia Olhede and Patrick Wolfe for initially introducing us to applications of graphons in machine learning of networks and, in particular, to the problem of graphon estimation. We are indebted to Sofia Olhede and Patrick Wolfe for numerous helpful discussions in the early stages of this work, to Alessandro Rinaldo for providing valuable feedback on our paper, and to the anonymous referees for their many helpful comments.
Citation
Christian Borgs, Jennifer T. Chayes, Henry Cohn, Shirshendu Ganguly. "Consistent nonparametric estimation for heavy-tailed sparse graphs." Ann. Statist. 49 (4) 1904–1930, August 2021. https://doi.org/10.1214/20-AOS1985
Information