The Annals of Applied Statistics

Discussion of “Coauthorship and citation networks for statisticians”

Song Wang and Karl Rohe

Abstract

Pengsheng Ji and Jiashun Jin have collected and analyzed a fun and fascinating data set that we are eager to use as an example in a course on Statistical Network Analysis. In this comment, we partition the core of the paper citation graph and interpret the clusters by analyzing the paper abstracts using bag-of-words. Under the Stochastic Block Model (SBM), the eigengap reveals the number of clusters. We find several eigengaps, and clusters remain beyond the largest one. Through this illustration, we argue against a simplistic interpretation of model selection results from the SBM literature. In short, don’t mind the gap.
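
To make the eigengap idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' code and does not use the citation data: it samples a small synthetic planted-partition SBM and reports where the largest gap falls among the leading eigenvalue magnitudes of the adjacency matrix. The function names (`simulate_sbm`, `largest_eigengap`), the block sizes, and the edge probabilities are all illustrative assumptions.

```python
import numpy as np


def simulate_sbm(block_sizes, p_in, p_out, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a planted-partition SBM.

    Nodes in the same block connect with probability p_in, nodes in
    different blocks with probability p_out. (Illustrative helper.)
    """
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Draw each edge once in the strict upper triangle (no self-loops),
    # then mirror it to obtain an undirected graph.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(float)


def largest_eigengap(adjacency, max_k=10):
    """Return (k_hat, leading eigenvalue magnitudes, consecutive gaps).

    k_hat is the position of the largest gap among the max_k + 1 leading
    eigenvalue magnitudes, read as a guess at the number of blocks.
    """
    eigvals = np.linalg.eigvalsh(adjacency)
    magnitudes = np.sort(np.abs(eigvals))[::-1][: max_k + 1]
    gaps = magnitudes[:-1] - magnitudes[1:]
    k_hat = int(np.argmax(gaps)) + 1
    return k_hat, magnitudes, gaps


# Three equally sized, well-separated blocks (assumed toy settings).
adjacency = simulate_sbm([100, 100, 100], p_in=0.5, p_out=0.05)
k_hat, magnitudes, gaps = largest_eigengap(adjacency)
print("estimated number of blocks:", k_hat)
print("leading gaps:", np.round(gaps[:5], 2))
```

With blocks this well separated, the largest gap should fall after the third eigenvalue. On real networks such as the citation graph discussed here, several gaps of comparable size can appear, which is the situation the abstract warns against reading too literally.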

Article information

Source
Ann. Appl. Stat., Volume 10, Number 4 (2016), 1820-1826.

Dates
Received: August 2016
First available in Project Euclid: 5 January 2017

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1483606838

Digital Object Identifier
doi:10.1214/16-AOAS977

Mathematical Reviews number (MathSciNet)
MR3592035

Zentralblatt MATH identifier
06688755

Keywords
Networks; spectral clustering; text analysis; eigengap

Citation

Wang, Song; Rohe, Karl. Discussion of “Coauthorship and citation networks for statisticians”. Ann. Appl. Stat. 10 (2016), no. 4, 1820–1826. doi:10.1214/16-AOAS977. https://projecteuclid.org/euclid.aoas/1483606838


References

  • Bates, D. and Maechler, M. (2016). Matrix: Sparse and dense matrix classes and methods. R package version 1.2-6. Available at https://CRAN.R-project.org/package=Matrix.
  • Csardi, G. and Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems 1695.
  • Ji, P. and Jin, J. (2014). Coauthorship and citation networks for statisticians. Preprint. Available at arXiv:1410.2840.
  • Jin, J. (2015). Fast community detection by SCORE. Ann. Statist. 43 57–89.
  • Meyer, D., Hornik, K. and Feinerer, I. (2008). Text mining infrastructure in R. J. Stat. Softw. 25 1–54.
  • Qiu, Y. and Mei, J. (2016). rARPACK: Solvers for large scale eigenvalue and svd problems. R package version 0.11-0. Available at https://CRAN.R-project.org/package=rARPACK.
  • Qin, T. and Rohe, K. (2013). Regularized spectral clustering under the degree-corrected stochastic blockmodel. In Advances in Neural Information Processing Systems 3120–3128.
  • Wang, S. and Rohe, K. (2016). Supplement to “Discussion of ‘Coauthorship and citation networks for statisticians’.” DOI:10.1214/16-AOAS977SUPP.

See also

  • Main article: Coauthorship and citation networks for statisticians.

Supplemental materials

  • Code and Data. We provide the code and data sets to reproduce our results in this discussion.