Open Access
2018 Estimating a network from multiple noisy realizations
Can M. Le, Keith Levin, Elizaveta Levina
Electron. J. Statist. 12(2): 4697-4740 (2018). DOI: 10.1214/18-EJS1521
Abstract

Complex interactions between entities are often represented as edges in a network. In practice, the network is often constructed from noisy measurements and inevitably contains some errors. In this paper we consider the problem of estimating a network from multiple noisy observations where edges of the original network are recorded with both false positives and false negatives. This problem is motivated by neuroimaging applications where brain networks of a group of patients with a particular brain condition could be viewed as noisy versions of an unobserved true network corresponding to the disease. The key to optimally leveraging these multiple observations is to take advantage of network structure, and here we focus on the case where the true network contains communities. Communities are common in real networks in general and in particular are believed to be presented in brain networks. Under a community structure assumption on the truth, we derive an efficient method to estimate the noise levels and the original network, with theoretical guarantees on the convergence of our estimates. We show on synthetic networks that the performance of our method is close to an oracle method using the true parameter values, and apply our method to fMRI brain data, demonstrating that it constructs stable and plausible estimates of the population network.

References

1.

[1] E. Abbe. Community detection and stochastic block models: Recent developments., Journal of Machine Learning Research, 18(177):1–86, 2018. MR3827065 06982933[1] E. Abbe. Community detection and stochastic block models: Recent developments., Journal of Machine Learning Research, 18(177):1–86, 2018. MR3827065 06982933

2.

[2] A. A. Amini, A. Chen, P. J. Bickel, and E. Levina. Fitting community models to large sparse networks., Annals of Statistics, 41(4) :2097–2122, 2013. 1277.62166 10.1214/13-AOS1138 euclid.aos/1382547514[2] A. A. Amini, A. Chen, P. J. Bickel, and E. Levina. Fitting community models to large sparse networks., Annals of Statistics, 41(4) :2097–2122, 2013. 1277.62166 10.1214/13-AOS1138 euclid.aos/1382547514

3.

[3] S. Balakrishnan, M. J. Wainwright, and B. Yu. Statistical guarantees for the EM algorithm: From population to sample-based analysis., Annals of Statistics, 45(1):77–120, 2017. 1367.62052 10.1214/16-AOS1435 euclid.aos/1487667618[3] S. Balakrishnan, M. J. Wainwright, and B. Yu. Statistical guarantees for the EM algorithm: From population to sample-based analysis., Annals of Statistics, 45(1):77–120, 2017. 1367.62052 10.1214/16-AOS1435 euclid.aos/1487667618

4.

[4] D. S. Bassett, B. G. Nelson, B. A. Mueller, J. Camchong, and K. O. Lim. Altered resting state complexity in schizophrenia., Neuroimage, 59(3) :2196–2207, 2012.[4] D. S. Bassett, B. G. Nelson, B. A. Mueller, J. Camchong, and K. O. Lim. Altered resting state complexity in schizophrenia., Neuroimage, 59(3) :2196–2207, 2012.

5.

[5] P. J. Bickel and A. Chen. A nonparametric view of network models and Newman-Girvan and other modularities., Proc. Natl. Acad. Sci. USA, 106 :21068–21073, 2009. 1359.62411 10.1073/pnas.0907096106[5] P. J. Bickel and A. Chen. A nonparametric view of network models and Newman-Girvan and other modularities., Proc. Natl. Acad. Sci. USA, 106 :21068–21073, 2009. 1359.62411 10.1073/pnas.0907096106

6.

[6] P. J. Bickel and K. A. Doksum., Mathematical statistics: Basic ideas and selected topics–2nd edition (updated printing), volume 1. Pearson Prentice Hall, 2007. 0403.62001[6] P. J. Bickel and K. A. Doksum., Mathematical statistics: Basic ideas and selected topics–2nd edition (updated printing), volume 1. Pearson Prentice Hall, 2007. 0403.62001

7.

[7] S. Boucheron, G. Lugosi, and P. Massart., Concentration inequalities: a nonasymptotic theory of independence. Oxford University Press, 2013. 1279.60005[7] S. Boucheron, G. Lugosi, and P. Massart., Concentration inequalities: a nonasymptotic theory of independence. Oxford University Press, 2013. 1279.60005

8.

[8] R. L. Buckner, J. Sepulcre, T. Talukdar, F. Krienen, H. Liu, T. Hedden, J. R. Andrews-Hanna, R. A. Sperling, and K. A. Johnson. Cortical hubs revealed by intrinsic functional connectivity: Mapping, assessment of stability, and relation to alzheimer’s disease., J Neurosci, 29(6) :1860–1873, 2009.[8] R. L. Buckner, J. Sepulcre, T. Talukdar, F. Krienen, H. Liu, T. Hedden, J. R. Andrews-Hanna, R. A. Sperling, and K. A. Johnson. Cortical hubs revealed by intrinsic functional connectivity: Mapping, assessment of stability, and relation to alzheimer’s disease., J Neurosci, 29(6) :1860–1873, 2009.

9.

[9] K. Chen and J. Lei. Network cross-validation for determining the number of communities in network data., arXiv :1411.1715, 2014. 1398.62159 10.1080/01621459.2016.1246365[9] K. Chen and J. Lei. Network cross-validation for determining the number of communities in network data., arXiv :1411.1715, 2014. 1398.62159 10.1080/01621459.2016.1246365

10.

[10] P. Chin, A. Rao, and V. Vu. Stochastic block model and community detection in the sparse graphs: A spectral algorithm with optimal rate of recovery., arXiv :1501.05021, 2015.[10] P. Chin, A. Rao, and V. Vu. Stochastic block model and community detection in the sparse graphs: A spectral algorithm with optimal rate of recovery., arXiv :1501.05021, 2015.

11.

[11] D. Choi and P. Wolfe. Co-clustering separately exchangeable network data., Annals of Statistics, 42(1):29–63, 2014. 1294.62059 10.1214/13-AOS1173 euclid.aos/1389795744[11] D. Choi and P. Wolfe. Co-clustering separately exchangeable network data., Annals of Statistics, 42(1):29–63, 2014. 1294.62059 10.1214/13-AOS1173 euclid.aos/1389795744

12.

[12] A. P. Dawid and A. M. Skene. Maximum likelihood estimation of observer error-rates using the em algorithm., Applied statistics, pages 20–28, 1979.[12] A. P. Dawid and A. M. Skene. Maximum likelihood estimation of observer error-rates using the em algorithm., Applied statistics, pages 20–28, 1979.

13.

[13] S. Fortunato. Community detection in graphs., Physics Reports, 486(3-5):75 – 174, 2010.[13] S. Fortunato. Community detection in graphs., Physics Reports, 486(3-5):75 – 174, 2010.

14.

[14] C. Gao, Z. Ma, A. Y. Zhang, and H. H. Zhou. Achieving optimal misclassification proportion in stochastic block model., arXiv :1505.03772, 2015. 06781381[14] C. Gao, Z. Ma, A. Y. Zhang, and H. H. Zhou. Achieving optimal misclassification proportion in stochastic block model., arXiv :1505.03772, 2015. 06781381

15.

[15] A. Ghosh, S. Kale, and R. P. McAfee. Who moderates the moderators?: crowdsourcing abuse detection in user-generated content. In, EC, 2011.[15] A. Ghosh, S. Kale, and R. P. McAfee. Who moderates the moderators?: crowdsourcing abuse detection in user-generated content. In, EC, 2011.

16.

[16] A. Goldenberg, A. X. Zheng, S. E. Fienberg, and E. M. Airoldi. A survey of statistical network models., Foundations and Trends in Machine Learning, 2:129–233, 2010. 1184.68030 10.1561/2200000005[16] A. Goldenberg, A. X. Zheng, S. E. Fienberg, and E. M. Airoldi. A survey of statistical network models., Foundations and Trends in Machine Learning, 2:129–233, 2010. 1184.68030 10.1561/2200000005

17.

[17] P. W. Holland, K. B. Laskey, and S. Leinhardt. Stochastic blockmodels: first steps., Social Networks, 5(2):109–137, 1983.[17] P. W. Holland, K. B. Laskey, and S. Leinhardt. Stochastic blockmodels: first steps., Social Networks, 5(2):109–137, 1983.

18.

[18] S. H. Hosseini and S. R. Kesler. Comparing connectivity pattern and small-world organization between structural correlation and resting-state networks in healthy adults., Neuroimage, 78:402–414, 2013.[18] S. H. Hosseini and S. R. Kesler. Comparing connectivity pattern and small-world organization between structural correlation and resting-state networks in healthy adults., Neuroimage, 78:402–414, 2013.

19.

[19] A. Joseph and B. Yu. Impact of regularization on spectral clustering., Ann. Statist., 44(4) :1765–1791, 2016. 1357.62229 10.1214/16-AOS1447 euclid.aos/1467894715[19] A. Joseph and B. Yu. Impact of regularization on spectral clustering., Ann. Statist., 44(4) :1765–1791, 2016. 1357.62229 10.1214/16-AOS1447 euclid.aos/1467894715

20.

[20] B. Karrer and M. E. J. Newman. Stochastic blockmodels and community structure in networks., Physical Review E, 83 :016107, 2011.[20] B. Karrer and M. E. J. Newman. Stochastic blockmodels and community structure in networks., Physical Review E, 83 :016107, 2011.

21.

[21] F. Klimm, D. S. Bassett, J. M. Carlson, and P. J. Mucha. Resolving structural variability in network models and the brain., PLoS Comput Biol, 10(3): e1003491, 2014.[21] F. Klimm, D. S. Bassett, J. M. Carlson, and P. J. Mucha. Resolving structural variability in network models and the brain., PLoS Comput Biol, 10(3): e1003491, 2014.

22.

[22] V. Koltchinskii., Oracle inequalities in empirical risk minimization and sparse recovery problems. Springer Verlag, 2011. 1223.91002[22] V. Koltchinskii., Oracle inequalities in empirical risk minimization and sparse recovery problems. Springer Verlag, 2011. 1223.91002

23.

[23] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, and P. Zhang. Spectral redemption in clustering sparse networks., Proceedings of the National Academy of Sciences, 110(52) :20935–20940, 2013. 1359.62252 10.1073/pnas.1312486110[23] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, and P. Zhang. Spectral redemption in clustering sparse networks., Proceedings of the National Academy of Sciences, 110(52) :20935–20940, 2013. 1359.62252 10.1073/pnas.1312486110

24.

[24] C. M. Le and E. Levina. Estimating the number of communities in networks by spectral methods., arXiv :1507.00827, 2015.[24] C. M. Le and E. Levina. Estimating the number of communities in networks by spectral methods., arXiv :1507.00827, 2015.

25.

[25] C. M. Le, E. Levina, and R. Vershynin. Sparse random graphs: regularization and concentration of the laplacian., arXiv :1502.03049, 2015. 1373.05179 10.1002/rsa.20713[25] C. M. Le, E. Levina, and R. Vershynin. Sparse random graphs: regularization and concentration of the laplacian., arXiv :1502.03049, 2015. 1373.05179 10.1002/rsa.20713

26.

[26] C. M. Le, E. Levina, and R. Vershynin. Concentration and regularization of random graphs., Random Struct. Alg., doi: 10.1002/rsa.20713, 2017. 1373.05179 10.1002/rsa.20713[26] C. M. Le, E. Levina, and R. Vershynin. Concentration and regularization of random graphs., Random Struct. Alg., doi: 10.1002/rsa.20713, 2017. 1373.05179 10.1002/rsa.20713

27.

[27] M. Ledoux and M. Talagrand., Probability in Banach spaces: Isoperimetry and processes. Springer-Verlag, Berlin, 1991. 0748.60004[27] M. Ledoux and M. Talagrand., Probability in Banach spaces: Isoperimetry and processes. Springer-Verlag, Berlin, 1991. 0748.60004

28.

[28] K. Levin, A. Athreya, M. Tang, V. Lyzinski, and C. E. Priebe. A central limit theorem for an omnibus embedding of random dot product graphs., arXiv :1705.09355, 2017.[28] K. Levin, A. Athreya, M. Tang, V. Lyzinski, and C. E. Priebe. A central limit theorem for an omnibus embedding of random dot product graphs., arXiv :1705.09355, 2017.

29.

[29] Q. Liu, J. Peng, and A. T. Ihler. Variational inference for crowdsourcing., Advances in neural information processing systems, pages 692–700, 2012.[29] Q. Liu, J. Peng, and A. T. Ihler. Variational inference for crowdsourcing., Advances in neural information processing systems, pages 692–700, 2012.

30.

[30] V. Lyzinski, D. L. Sussman, M. Tang, A. Athreya, and C. E. Priebe. Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding., Electronic Journal of Statistics, 8(2) :2905–2922, 2014. 1308.62131 10.1214/14-EJS978[30] V. Lyzinski, D. L. Sussman, M. Tang, A. Athreya, and C. E. Priebe. Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding., Electronic Journal of Statistics, 8(2) :2905–2922, 2014. 1308.62131 10.1214/14-EJS978

31.

[31] M. Narayan and G. I. Allen. Mixed effects models for resampled network statistics improves statistical power to find differences in multi-subject functional connectivity., Frontiers in Neuroscience, 10(108), 2016.[31] M. Narayan and G. I. Allen. Mixed effects models for resampled network statistics improves statistical power to find differences in multi-subject functional connectivity., Frontiers in Neuroscience, 10(108), 2016.

32.

[32] M. Narayan, G. I. Allen, and S. Tomson. Two sample inference for populations of graphical models with applications to functional connectivity., arXiv :1502.03853, 2015.[32] M. Narayan, G. I. Allen, and S. Tomson. Two sample inference for populations of graphical models with applications to functional connectivity., arXiv :1502.03853, 2015.

33.

[33] M. E. J. Newman. Fast algorithm for detecting community structure in networks., Phys. Rev. E, 69(6) :066133, 2004.[33] M. E. J. Newman. Fast algorithm for detecting community structure in networks., Phys. Rev. E, 69(6) :066133, 2004.

34.

[34] M. E. J. Newman. Modularity and community structure in networks., Proc. Natl. Acad. Sci. USA, 103(23) :8577–8582, 2006.[34] M. E. J. Newman. Modularity and community structure in networks., Proc. Natl. Acad. Sci. USA, 103(23) :8577–8582, 2006.

35.

[35] S. C. Olhede and P. J. Wolfe. Network histograms and universality of blockmodel approximation., Proceedings of the National Academy of Sciences of the USA, 111 :14722–14727, 2014.[35] S. C. Olhede and P. J. Wolfe. Network histograms and universality of blockmodel approximation., Proceedings of the National Academy of Sciences of the USA, 111 :14722–14727, 2014.

36.

[36] J. D. Power, A. L. Cohen, S. M. Nelson, G. S. Wig, K. A. Barnes, J. A. Church, A. C. Vogel, T. O. Laumann, F. M. Miezin, B. L. Schlaggar, and S. E. Petersen. Functional network organization of the human brain., Neuron, 72(4):665–78, 2011.[36] J. D. Power, A. L. Cohen, S. M. Nelson, G. S. Wig, K. A. Barnes, J. A. Church, A. C. Vogel, T. O. Laumann, F. M. Miezin, B. L. Schlaggar, and S. E. Petersen. Functional network organization of the human brain., Neuron, 72(4):665–78, 2011.

37.

[37] V. C. Raykar, S. Yu, L. H. Zhao, G. H. Valadez, C. Florin, L. Bogoni, and L. Moy. Learning from crowds., Journal of Machine Learning Research, 11 :1297–1322, 2010.[37] V. C. Raykar, S. Yu, L. H. Zhao, G. H. Valadez, C. Florin, L. Bogoni, and L. Moy. Learning from crowds., Journal of Machine Learning Research, 11 :1297–1322, 2010.

38.

[38] M. Rubinov and O. Sporns. Complex network measures of brain connectivity: uses and interpretations., NeuroImage, 52(3) :1059–1069, 2010.[38] M. Rubinov and O. Sporns. Complex network measures of brain connectivity: uses and interpretations., NeuroImage, 52(3) :1059–1069, 2010.

39.

[39] R. Tang, M. Ketcha, J. T. Vogelstein, C. E. Priebe, and D. L. Sussman. Law of large graphs., arXiv :1609.01672, 2016.[39] R. Tang, M. Ketcha, J. T. Vogelstein, C. E. Priebe, and D. L. Sussman. Law of large graphs., arXiv :1609.01672, 2016.

40.

[40] S. Taylor, A. Chen, I. Tso, I. Liberzon, and R. Welsh. Social appraisal in chronic psychosis: Role of medial frontal and occipital networks., J. Psychiatr. Res., 45(4):526–538, 2011.[40] S. Taylor, A. Chen, I. Tso, I. Liberzon, and R. Welsh. Social appraisal in chronic psychosis: Role of medial frontal and occipital networks., J. Psychiatr. Res., 45(4):526–538, 2011.

41.

[41] S. Taylor, E. Demeter, K. Phan, I. Tso, and R. Welsh. Abnormal gabaergic function and negative affect in schizophrenia., Neuropsychopharmacology, 39(4) :1000–1008, 2014.[41] S. Taylor, E. Demeter, K. Phan, I. Tso, and R. Welsh. Abnormal gabaergic function and negative affect in schizophrenia., Neuropsychopharmacology, 39(4) :1000–1008, 2014.

42.

[42] J. A. Tropp. User-friendly tail bounds for sums of random matrices., Foundations of Computational Mathematics, 12(4):389–434, 2012. 1259.60008 10.1007/s10208-011-9099-z[42] J. A. Tropp. User-friendly tail bounds for sums of random matrices., Foundations of Computational Mathematics, 12(4):389–434, 2012. 1259.60008 10.1007/s10208-011-9099-z

43.

[43] L. Wang, Z. Zhang, and D. Dunson. Common and individual structure of multiple networks., arXiv :1707.06360, 2017.[43] L. Wang, Z. Zhang, and D. Dunson. Common and individual structure of multiple networks., arXiv :1707.06360, 2017.

44.

[44] R. Wang and P. Bickel. Likelihood-based model selection for stochastic block models., arXiv :1502.02069, 2015.[44] R. Wang and P. Bickel. Likelihood-based model selection for stochastic block models., arXiv :1502.02069, 2015.

45.

[45] M. Xia, J. Wang, and Y. He. Brainnet viewer: A network visualization tool for human brain connectomics., PLoS ONE 8: e68910, 2013.[45] M. Xia, J. Wang, and Y. He. Brainnet viewer: A network visualization tool for human brain connectomics., PLoS ONE 8: e68910, 2013.

46.

[46] Y. Zhang, X. Chen, D. Zhou, and M. I. Jordan. Spectral methods meet em: A provably optimal algorithm for crowdsourcing., Advances in neural information processing systems, page 1260–1268, 2014.[46] Y. Zhang, X. Chen, D. Zhou, and M. I. Jordan. Spectral methods meet em: A provably optimal algorithm for crowdsourcing., Advances in neural information processing systems, page 1260–1268, 2014.
Can M. Le, Keith Levin, and Elizaveta Levina "Estimating a network from multiple noisy realizations," Electronic Journal of Statistics 12(2), 4697-4740, (2018). https://doi.org/10.1214/18-EJS1521
Received: 1 October 2017; Published: 2018
Vol.12 • No. 2 • 2018
Back to Top