## The Annals of Statistics

### Rate-optimal graphon estimation

#### Abstract

Network analysis is becoming one of the most active research areas in statistics. Significant advances have been made recently in developing theories, methodologies and algorithms for analyzing networks. However, there has been little fundamental study of optimal estimation. In this paper, we establish the optimal rate of convergence for graphon estimation. For the stochastic block model with $k$ clusters, we show that the optimal rate under the mean squared error is $n^{-1}\log k+k^{2}/n^{2}$. The minimax upper bound improves the existing results in the literature through a technique of solving a quadratic equation. When $k\leq\sqrt{n\log n}$, as the number of clusters $k$ grows, the minimax rate grows slowly, with only a logarithmic order $n^{-1}\log k$. A key step in establishing the lower bound is to construct a novel subset of the parameter space and then apply Fano’s lemma, from which we see a clear distinction between the nonparametric graphon estimation problem and classical nonparametric regression, due to the lack of identifiability of the order of nodes in exchangeable random graph models. As an immediate application, we consider nonparametric graphon estimation in a Hölder class with smoothness $\alpha$. When the smoothness $\alpha\geq1$, the optimal rate of convergence is $n^{-1}\log n$, independent of $\alpha$, while for $\alpha\in(0,1)$, the rate is $n^{-2\alpha/(\alpha+1)}$, which is, to our surprise, identical to the classical nonparametric rate.
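
To make the stated rates concrete, the following minimal Python sketch evaluates them numerically. It is an illustration only, not code from the paper: the function names `sbm_rate` and `holder_rate` are hypothetical, all constants are dropped, and the formulas are simply the orders quoted in the abstract.

```python
import math


def sbm_rate(n: int, k: int) -> float:
    """Order of the minimax mean squared error quoted in the abstract for a
    stochastic block model with k clusters on n nodes:
    log(k)/n + k^2/n^2 (constants dropped; hypothetical helper)."""
    if n < 1 or k < 1:
        raise ValueError("n and k must be positive integers")
    return math.log(k) / n + (k / n) ** 2


def holder_rate(n: int, alpha: float) -> float:
    """Order of the optimal rate quoted in the abstract for a Hölder class
    with smoothness alpha (constants dropped; hypothetical helper)."""
    if n < 2 or alpha <= 0:
        raise ValueError("need n >= 2 and alpha > 0")
    if alpha >= 1:
        # For alpha >= 1 the quoted rate no longer depends on alpha.
        return math.log(n) / n
    return n ** (-2 * alpha / (alpha + 1))


if __name__ == "__main__":
    n = 10_000
    # Print the two terms of the block model rate separately for a few k.
    for k in (2, 10, 100, 1000):
        log_term = math.log(k) / n
        square_term = (k / n) ** 2
        print(f"k={k:5d}  log(k)/n={log_term:.3e}  k^2/n^2={square_term:.3e}")
    # Print the Hölder-class rate for a few smoothness values.
    for alpha in (0.25, 0.5, 0.75, 1.0, 2.0):
        print(f"alpha={alpha:4.2f}  rate={holder_rate(n, alpha):.3e}")
```

Printing the two terms of `sbm_rate` separately shows which one is larger for a given pair $(n, k)$, which is one way to see the regime discussed in the abstract for moderate $k$.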

#### Article information

Source
Ann. Statist., Volume 43, Number 6 (2015), 2624–2652.

Dates
Revised: June 2015
First available in Project Euclid: 7 October 2015

https://projecteuclid.org/euclid.aos/1444222087

Digital Object Identifier
doi:10.1214/15-AOS1354

Mathematical Reviews number (MathSciNet)
MR3405606

Zentralblatt MATH identifier
1332.60050

Subjects
Primary: 60G05: Foundations of stochastic processes

#### Citation

Gao, Chao; Lu, Yu; Zhou, Harrison H. Rate-optimal graphon estimation. Ann. Statist. 43 (2015), no. 6, 2624–2652. doi:10.1214/15-AOS1354. https://projecteuclid.org/euclid.aos/1444222087

#### References

• [1] Airoldi, E. M., Blei, D. M., Fienberg, S. E. and Xing, E. P. (2008). Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9 1981–2014.
• [2] Airoldi, E. M., Costa, T. B. and Chan, S. H. (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. Adv. Neural Inf. Process. Syst. 26 692–700.
• [3] Aldous, D. J. (1981). Representations for partially exchangeable arrays of random variables. J. Multivariate Anal. 11 581–598.
• [4] Amini, A. A., Chen, A., Bickel, P. J. and Levina, E. (2013). Pseudo-likelihood methods for community detection in large sparse networks. Ann. Statist. 41 2097–2122.
• [5] Anandkumar, A., Ge, R., Hsu, D. and Kakade, S. M. (2014). A tensor approach to learning mixed membership community models. J. Mach. Learn. Res. 15 2239–2312.
• [6] Bickel, P., Choi, D., Chang, X. and Zhang, H. (2013). Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels. Ann. Statist. 41 1922–1943.
• [7] Bickel, P. J. and Chen, A. (2009). A nonparametric view of network models and Newman–Girvan and other modularities. Proc. Natl. Acad. Sci. USA 106 21068–21073.
• [8] Cai, T. T. and Li, X. (2015). Robust and computationally feasible community detection in the presence of arbitrary outlier nodes. Ann. Statist. 43 1027–1059.
• [9] Chan, S. H. and Airoldi, E. M. (2014). A consistent histogram estimator for exchangeable graph models. Preprint. Available at arXiv:1402.1888.
• [10] Chatterjee, S. (2015). Matrix estimation by universal singular value thresholding. Ann. Statist. 43 177–214.
• [11] Cheng, Y. and Church, G. M. (2000). Biclustering of expression data. In Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology 93–103. AAAI.
• [12] Chin, P., Rao, A. and Vu, V. (2015). Stochastic block model and community detection in the sparse graphs: A spectral algorithm with optimal rate of recovery. Preprint. Available at arXiv:1501.05021.
• [13] Coifman, R. R. and Gavish, M. (2011). Harmonic analysis of digital data bases. In Wavelets and Multiscale Analysis. Appl. Numer. Harmon. Anal. 161–197. Birkhäuser, New York.
• [14] Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. (7) 28 33–61.
• [15] Gao, C., Lu, Y. and Zhou, H. H. (2015). Supplement to “Rate-optimal graphon estimation.” DOI:10.1214/15-AOS1354SUPP.
• [16] Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2015). Achieving optimal misclassification proportion in stochastic block model. Preprint. Available at arXiv:1505.03772.
• [17] Gavish, M. and Coifman, R. R. (2012). Sampling, denoising and compression of matrices by coherent matrix organization. Appl. Comput. Harmon. Anal. 33 354–369.
• [18] Gavish, M., Nadler, B. and Coifman, R. R. (2010). Multiscale wavelets on trees, graphs and high dimensional data: Theory and applications to semi supervised learning. In Proceedings of the 27th International Conference on Machine Learning (ICML-10) 367–374. Omnipress, Madison, WI.
• [19] Girvan, M. and Newman, M. E. J. (2002). Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99 7821–7826 (electronic).
• [20] Goldenberg, A., Zheng, A. X., Fienberg, S. E. and Airoldi, E. M. (2010). A survey of statistical network models. Found. Trends Mach. Learn. 2 129–233.
• [21] Guimerà, R. and Sales-Pardo, M. (2009). Missing and spurious interactions and the reconstruction of complex networks. Proc. Natl. Acad. Sci. USA 106 22073–22078.
• [22] Guntuboyina, A. (2011). Lower bounds for the minimax risk using $f$-divergences, and applications. IEEE Trans. Inform. Theory 57 2386–2399.
• [23] Hajek, B., Wu, Y. and Xu, J. (2014). Achieving exact cluster recovery threshold via semidefinite programming. Preprint. Available at arXiv:1412.6156.
• [24] Handcock, M. S., Raftery, A. E. and Tantrum, J. M. (2007). Model-based clustering for social networks. J. Roy. Statist. Soc. Ser. A 170 301–354.
• [25] Hartigan, J. A. (1972). Direct clustering of a data matrix. J. Amer. Statist. Assoc. 67 123–129.
• [26] Hoff, P. (2008). Modeling homophily and stochastic equivalence in symmetric relational data. In Advances in Neural Information Processing Systems 657–664. MIT Press, Cambridge, MA.
• [27] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Social Networks 5 109–137.
• [28] Holland, P. W. and Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs. J. Amer. Statist. Assoc. 76 33–65.
• [29] Hoover, D. N. (1979). Relations on Probability Spaces and Arrays of Random Variables. Institute for Advanced Study, Princeton, NJ.
• [30] Joseph, A. and Yu, B. (2013). Impact of regularization on spectral clustering. Preprint. Available at arXiv:1312.1733.
• [31] Kallenberg, O. (1989). On the representation theorem for exchangeable arrays. J. Multivariate Anal. 30 137–154.
• [32] Karrer, B. and Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E (3) 83 016107, 10.
• [33] Lei, J. and Rinaldo, A. (2015). Consistency of spectral clustering in stochastic block models. Ann. Statist. 43 215–237.
• [34] Lloyd, J., Orbanz, P., Ghahramani, Z. and Roy, D. (2013). Random function priors for exchangeable arrays with applications to graphs and relational data. Adv. Neural Inf. Process. Syst. 25 1007–1015.
• [35] Lovász, L. (2012). Large Networks and Graph Limits. American Mathematical Society Colloquium Publications 60. Amer. Math. Soc., Providence, RI.
• [36] Lovász, L. and Szegedy, B. (2006). Limits of dense graph sequences. J. Combin. Theory Ser. B 96 933–957.
• [37] Lü, L. and Zhou, T. (2011). Link prediction in complex networks: A survey. Phys. A 390 1150–1170.
• [38] Madeira, S. C. and Oliveira, A. L. (2004). Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Trans. Comput. Biol. Bioinform. 1 24–45.
• [39] Massart, P. (2007). Concentration Inequalities and Model Selection. Lecture Notes in Math. 1896. Springer, Berlin.
• [40] Mirkin, B. (1998). Mathematical classification and clustering: From how to what and why. In Classification, Data Analysis, and Data Highways (Potsdam, 1997) 172–181. Springer, Berlin.
• [41] Mossel, E., Neeman, J. and Sly, A. (2014). Consistency thresholds for binary symmetric block models. Preprint. Available at arXiv:1407.1591.
• [42] Newman, M. E. and Leicht, E. A. (2007). Mixture models and exploratory analysis in networks. Proc. Natl. Acad. Sci. USA 104 9564–9569.
• [43] Nowicki, K. and Snijders, T. A. B. (2001). Estimation and prediction for stochastic blockstructures. J. Amer. Statist. Assoc. 96 1077–1087.
• [44] Olhede, S. C. and Wolfe, P. J. (2014). Network histograms and universality of blockmodel approximation. Proc. Natl. Acad. Sci. USA 111 14722–14727.
• [45] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878–1915.
• [46] Sarkar, P., Chakrabarti, D. and Jordan, M. (2012). Nonparametric link prediction in dynamic networks. Preprint. Available at arXiv:1206.6394.
• [47] Tsybakov, A. B. (2009). Introduction to Nonparametric Estimation. Springer, New York.
• [48] Vershynin, R. (2012). Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing 210–268. Cambridge Univ. Press, Cambridge.
• [49] Wasserman, S. (1994). Social Network Analysis: Methods and Applications. Cambridge Univ. Press, Cambridge.
• [50] Wolfe, P. J. and Olhede, S. C. (2013). Nonparametric graphon estimation. Preprint. Available at arXiv:1309.5936.
• [51] Yu, B. (1997). Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam 423–435. Springer, New York.
• [52] Yu, H., Braun, P., Yıldırım, M. A., Lemmens, I., Venkatesan, K., Sahalie, J., Hirozane-Kishikawa, T., Gebreab, F., Li, N., Simonis, N. et al. (2008). High-quality binary protein interaction map of the yeast interactome network. Science 322 104–110.
• [53] Zhang, A. Y. and Zhou, H. H. (2015). Minimax rates of community detection in stochastic block model. Available at http://www.stat.yale.edu/~yz482/community_detection_minimax.pdf.
• [54] Zhang, X., Wang, X., Zhao, C., Yi, D. and Xie, Z. (2014). Degree-corrected stochastic block models and reliability in networks. Phys. A 393 553–559.
• [55] Zhao, Y., Levina, E. and Zhu, J. (2012). Consistency of community detection in networks under degree-corrected stochastic block models. Ann. Statist. 40 2266–2292.