Consistent nonparametric estimation for heavy-tailed sparse graphs

Christian Borgs; Jennifer T. Chayes; Henry Cohn; Shirshendu Ganguly

doi:10.1214/20-AOS1985

Abstract

We study graphons as a nonparametric generalization of stochastic block models, and show how to obtain compactly represented estimators for sparse networks in this framework. In contrast to previous work, we relax the usual boundedness assumption for the generating graphon and instead assume only integrability, so that we can handle networks that have long tails in their degree distributions. We also relax the usual assumption that the graphon is defined on the unit interval, to allow latent position graphs based on more general spaces.
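To make the setting concrete, the following is a minimal sketch of how a sparse graph might be sampled from an unbounded but integrable graphon. It is not taken from the paper. It assumes the common convention that latent positions are uniform on the unit interval and that vertices i and j are joined with probability min(1, rho * W(x_i, x_j)). The function names, the target density `rho`, and the power-law kernel `heavy_tailed_w` are all hypothetical choices made for illustration.

```python
import random


def heavy_tailed_w(x, y, alpha=0.4):
    """Hypothetical example graphon: unbounded near 0, integral 1 for alpha < 1."""
    return (1.0 - alpha) ** 2 * (x * y) ** (-alpha)


def sample_sparse_graph(n, rho, w, seed=0):
    """Sample a symmetric 0/1 adjacency matrix (assumed sparse graphon model).

    Each pair i < j is joined independently with probability
    min(1, rho * w(x_i, x_j)), where the x_i are i.i.d. uniform on (0, 1].
    """
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], which avoids evaluating w at 0.
    x = [1.0 - rng.random() for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, rho * w(x[i], x[j]))
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return x, adj


if __name__ == "__main__":
    positions, adjacency = sample_sparse_graph(200, 0.05, heavy_tailed_w)
    degrees = sorted((sum(row) for row in adjacency), reverse=True)
    print("largest degrees:", degrees[:5])
```

Because the kernel blows up near the origin, a few vertices with small latent position tend to collect many more edges than the rest, which is the kind of long-tailed degree behaviour a bounded graphon cannot produce.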

We analyze three algorithms. The first is a least squares algorithm, which gives a consistent estimator for all square-integrable graphons, with errors expressed in terms of the best possible stochastic block model approximation. Next, we analyze an algorithm based on the cut norm, which works for all integrable graphons. Finally, we show that clustering based on degrees works whenever the underlying degree distribution is atomless.
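The third idea, clustering based on degrees, can be illustrated with a short sketch. This is a simplified illustration under stated assumptions, not the authors' algorithm: it sorts vertices by degree, cuts them into `k` groups of nearly equal size, and averages the adjacency matrix over each pair of groups, dividing by the overall edge density so that the result is a block approximation of a normalized graphon. The function name `degree_block_estimate` and the equal-size binning rule are hypothetical.

```python
def degree_block_estimate(adj, k):
    """Block estimate from degree-sorted groups (illustrative sketch only).

    adj: symmetric 0/1 adjacency matrix as a list of lists, zero diagonal.
    k:   number of groups.
    Returns (label, blocks): label[v] is the group of vertex v, and
    blocks[a][b] is the edge density between groups a and b divided by
    the overall edge density.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    # Sort vertices by degree and assign nearly equal-sized groups by rank.
    order = sorted(range(n), key=lambda v: degrees[v])
    label = [0] * n
    for rank, v in enumerate(order):
        label[v] = rank * k // n

    edges = [[0] * k for _ in range(k)]
    pairs = [[0] * k for _ in range(k)]
    for i in range(n):
        for j in range(n):
            if i != j:
                edges[label[i]][label[j]] += adj[i][j]
                pairs[label[i]][label[j]] += 1

    # Overall density over ordered pairs; guard against an empty graph.
    total = sum(degrees)
    rho_hat = total / (n * (n - 1)) if n > 1 and total > 0 else 1.0
    blocks = [
        [
            (edges[a][b] / pairs[a][b]) / rho_hat if pairs[a][b] else 0.0
            for b in range(k)
        ]
        for a in range(k)
    ]
    return label, blocks


if __name__ == "__main__":
    # Star graph: vertex 0 is joined to vertices 1, 2 and 3.
    star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    print(degree_block_estimate(star, 2))
```

Grouping by degree rank can only separate vertices whose degrees differ, which gives some intuition for why a condition on the degree distribution, such as the atomless assumption in the abstract, is needed for this approach to work.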

Funding Statement

Shirshendu Ganguly was supported by an internship at Microsoft Research New England.

Acknowledgments

We thank David Choi, Sofia Olhede and Patrick Wolfe for initially introducing us to applications of graphons in machine learning of networks and, in particular, to the problem of graphon estimation. We are indebted to Sofia Olhede and Patrick Wolfe for numerous helpful discussions in the early stages of this work, to Alessandro Rinaldo for providing valuable feedback on our paper, and to the anonymous referees for their many helpful comments.

Citation

Christian Borgs, Jennifer T. Chayes, Henry Cohn, Shirshendu Ganguly. "Consistent nonparametric estimation for heavy-tailed sparse graphs." Ann. Statist. 49 (4) 1904-1930, August 2021. https://doi.org/10.1214/20-AOS1985

Information

Received: 1 July 2017; Revised: 1 May 2020; Published: August 2021

First available in Project Euclid: 29 September 2021

MathSciNet: MR4319235

zbMATH: 1486.62080

Digital Object Identifier: 10.1214/20-AOS1985

Subjects:

Primary: 62G20

Secondary: 05C80, 62H30

Keywords: estimation, graphons, sparse networks
