Electronic Journal of Probability

Contiguity and non-reconstruction results for planted partition models: the dense case

Debapratim Banerjee

Full-text: Open access

Abstract

We consider the two-block stochastic block model on $n$ nodes with asymptotically equal cluster sizes. The connection probabilities within and between clusters are denoted by $p_n:=\frac{a_n} {n}$ and $q_n:=\frac{b_n} {n}$ respectively. Mossel et al. [27] considered the case when $a_n=a$ and $b_n=b$ are fixed. They proved that the probability model of the stochastic block model and that of the Erdős–Rényi graph with the same average degree are mutually contiguous whenever $(a-b)^2<2(a+b)$ and are asymptotically singular whenever $(a-b)^2>2(a+b)$. Mossel et al. [27] also proved that when $(a-b)^2<2(a+b)$ no algorithm is able to find an estimate of the labeling of the nodes which is positively correlated with the true labeling. It is natural to ask what happens when $a_n$ and $b_n$ both grow to infinity. In this paper we consider the case when $a_{n} \to \infty $, $\frac{a_n} {n} \to p \in [0,1)$ and $(a_n-b_n)^2= \Theta (a_n+b_n)$. Observe that in this case $\frac{b_n} {n} \to p$ also. We show that here the models are mutually contiguous if asymptotically $(a_n-b_n)^2< 2(1-p)(a_n+b_n)$ and they are asymptotically singular if asymptotically $(a_n-b_n)^2 > 2(1-p)(a_n+b_n)$. We further prove that it is impossible to find an estimate of the labeling of the nodes which is positively correlated with the true labeling whenever $(a_n-b_n)^2< 2(1-p)(a_n+b_n)$ asymptotically. The results of this paper justify the negative part of a conjecture made in Decelle et al. (2011) [17] for dense graphs.
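As a rough illustration of the threshold stated in the abstract, the following minimal sketch compares $(a_n-b_n)^2$ with $2(1-p)(a_n+b_n)$ for given finite values. The function name `regime` and the use of $(a_n+b_n)/(2n)$ as a finite-$n$ stand-in for the limit $p$ are assumptions made here for illustration; the paper's statements are asymptotic, so the output is only a heuristic indication and not a result from the paper.

```python
def regime(a_n: float, b_n: float, n: int) -> str:
    """Heuristically compare (a_n - b_n)^2 with 2(1 - p)(a_n + b_n).

    Assumption: p is approximated by the average edge density
    (a_n + b_n) / (2n), since a_n/n and b_n/n share the same limit p.
    """
    p = (a_n + b_n) / (2 * n)  # finite-n proxy for the limit p (hypothetical choice)
    lhs = (a_n - b_n) ** 2
    rhs = 2 * (1 - p) * (a_n + b_n)
    if lhs < rhs:
        return "contiguous"  # below threshold: models mutually contiguous
    if lhs > rhs:
        return "singular"    # above threshold: models asymptotically singular
    return "boundary"        # exactly at the threshold


if __name__ == "__main__":
    # Dense example: n = 1000, within-cluster probability 0.5
    print(regime(500, 480, 1000))
    print(regime(500, 450, 1000))
```

When $p$ is close to $0$ the comparison reduces to the fixed-degree condition $(a-b)^2 < 2(a+b)$ of Mossel et al. [27].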

Article information

Source
Electron. J. Probab., Volume 23 (2018), paper no. 18, 28 pp.

Dates
Received: 22 November 2016
Accepted: 27 November 2017
First available in Project Euclid: 23 February 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejp/1519354947

Digital Object Identifier
doi:10.1214/17-EJP128

Mathematical Reviews number (MathSciNet)
MR3771755

Zentralblatt MATH identifier
1387.05230

Subjects
Primary: 05C80: Random graphs [See also 60B20]

Keywords
stochastic block model; planted partition model; threshold phase transition; community detection; random network; linear statistics

Rights
Creative Commons Attribution 4.0 International License.

Citation

Banerjee, Debapratim. Contiguity and non-reconstruction results for planted partition models: the dense case. Electron. J. Probab. 23 (2018), paper no. 18, 28 pp. doi:10.1214/17-EJP128. https://projecteuclid.org/euclid.ejp/1519354947


References

  • [1] E. Abbe. Community detection and stochastic block models: recent developments. J. Mach. Learn. Res., To appear, 2017.
  • [2] E. Abbe and C. Sandon. Achieving the KS threshold in the general stochastic block model with linearized acyclic belief propagation. In Advances in Neural Information Processing Systems 29, pages 1334–1342, 2016.
  • [3] E. Abbe, A. S. Bandeira, and G. Hall. Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62(1):471–487, 2016.
  • [4] G. W. Anderson and O. Zeitouni. A CLT for a band matrix model. Probab. Theory Related Fields, 134(2):283–338, 2006.
  • [5] G. W. Anderson, A. Guionnet, and O. Zeitouni. An introduction to random matrices. Cambridge University Press, Cambridge, 2010.
  • [6] D. Banerjee and A. Bose. Largest eigenvalue of large random block matrices: A combinatorial approach. Random Matrices: Theory and Applications, 6(02):1750008, 2017.
  • [7] D. Banerjee and Z. Ma. Optimal hypothesis testing for stochastic block models with growing degrees. arXiv preprint arXiv:1705.05305, 2017.
  • [8] J. Banks, C. Moore, J. Neeman, and P. Netrapalli. Information-theoretic thresholds for community detection in sparse networks. In Conference on Learning Theory, pages 383–416, 2016.
  • [9] P. J. Bickel and A. Chen. A nonparametric view of network models and Newman–Girvan and other modularities. Proceedings of the National Academy of Sciences, 106(50):21068–21073, 2009.
  • [10] R. B. Boppana. Eigenvalues and graph bisection: An average-case analysis. In 28th Annual Symposium on Foundations of Computer Science, pages 280–285. IEEE, 1987.
  • [11] C. Bordenave, M. Lelarge, and L. Massoulié. Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs. Ann. Probab., To appear.
  • [12] S. Bubeck, J. Ding, R. Eldan, and M. Z. Rácz. Testing for high-dimensional geometry in random graphs. Random Structures & Algorithms, 49(3):503–532, 2016.
  • [13] T. N. Bui, S. Chaudhuri, F. T. Leighton, and M. Sipser. Graph bisection algorithms with good average case behavior. Combinatorica, 7(2):171–191, 1987.
  • [14] T. Carleman. Les fonctions quasi analytiques (in French). Leçons professées au Collège de France. 1926.
  • [15] A. Coja-Oghlan. Graph partitioning via adaptive spectral techniques. Combinatorics, Probability & Computing, 19(2):227–284, 2010.
  • [16] A. Condon and R. M. Karp. Algorithms for graph partitioning on the planted partition model. Random Structures & Algorithms, 18(2):116–140, 2001.
  • [17] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Physical Review E, 84(6):066106, Dec. 2011.
  • [18] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1–38, 1977.
  • [19] M. E. Dyer and A. M. Frieze. The solution of some random NP-hard problems in polynomial expected time. J. Algorithms, 10(4):451–489, Dec. 1989.
  • [20] L. Isserlis. On a formula for the product-moment coefficient of any order of a normal frequency distribution in any number of variables. Biometrika, 12(1/2):134–139, 1918.
  • [21] S. Janson. Random regular graphs: asymptotic distributions and contiguity. Combin. Probab. Comput., 4(4):369–405, 1995.
  • [22] S. C. Johnson. Hierarchical clustering schemes. Psychometrika, 32(3):241–254, 1967.
  • [23] C. L. Mallows. A note on asymptotic joint normality. Ann. Math. Statist., 43(2):508–515, 1972.
  • [24] L. Massoulié. Community detection thresholds and the weak Ramanujan property. In STOC 2014: 46th Annual Symposium on the Theory of Computing, pages 1–10, New York, United States, June 2014.
  • [25] F. McSherry. Spectral partitioning of random graphs. In 42nd IEEE Symposium on Foundations of Computer Science, pages 529–537, Oct 2001.
  • [26] E. Mossel, J. Neeman, and A. Sly. A proof of the block model threshold conjecture. Combinatorica, To appear.
  • [27] E. Mossel, J. Neeman, and A. Sly. Reconstruction and estimation in the planted partition model. Probab. Theory Related Fields, 162(3–4):431–461, 2015.
  • [28] E. Mossel, J. Neeman, and A. Sly. Consistency thresholds for the planted bisection model. Electron. J. Probab., 21:1–24, 2016.
  • [29] M. E. J. Newman, D. J. Watts, and S. H. Strogatz. Random graph models of social networks. Proceedings of the National Academy of Sciences, 99(suppl 1):2566–2572, 2002.
  • [30] J. K. Pritchard, M. Stephens, and P. Donnelly. Inference of population structure using multilocus genotype data. Genetics, 155(2):945–959, 2000.
  • [31] K. Rohe, S. Chatterjee, and B. Yu. Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist., 39(4):1878–1915, 2011.
  • [32] J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 22(8):888–905, Aug. 2000.
  • [33] M. Sonka, V. Hlavac, and R. Boyle. Image Processing, Analysis, and Machine Vision. Thomson-Engineering, 2007.
  • [34] G. C. Wick. The evaluation of the collision matrix. Phys. Rev., 80:268–272, Oct 1950.
  • [35] N. C. Wormald. Models of random regular graphs. In Surveys in Combinatorics, 1999, pages 239–298. Cambridge University Press, 1999.