The Annals of Applied Statistics

Discussion of “Coauthorship and citation networks for statisticians”

Song Wang and Karl Rohe


Pengsheng Ji and Jiashun Jin have collected and analyzed a fun and fascinating data set that we are eager to use as an example in a course on Statistical Network Analysis. In this comment, we partition the core of the paper citation graph and interpret the clusters by analyzing the paper abstracts with a bag-of-words representation. Under the Stochastic Block Model (SBM), the eigengap reveals the number of clusters. We find several eigengaps, and clusters remain beyond the largest one. Through this illustration, we argue against a simplistic interpretation of model selection results from the SBM literature. In short, don’t mind the gap.
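To make the eigengap heuristic concrete, the following is a minimal sketch on simulated data, not the authors' analysis (their code and data are in the supplement). It draws a graph from a three-block SBM, computes the leading adjacency eigenvalues by magnitude, and reads off the position of the largest gap. The function names `simulate_sbm` and `eigengaps`, the block sizes, and the edge probabilities are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_sbm(sizes, p_in, p_out, seed=0):
    """Draw a symmetric adjacency matrix from a simple SBM (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(len(sizes)), sizes)
    n = labels.size
    # Edge probability is p_in within a block and p_out between blocks.
    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    # Sample the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels


def eigengaps(adj, n_top=10):
    """Return the n_top largest eigenvalue magnitudes and their consecutive gaps."""
    vals = np.linalg.eigvalsh(adj)
    top = np.sort(np.abs(vals))[::-1][:n_top]
    gaps = top[:-1] - top[1:]
    return top, gaps


# Illustrative parameters (assumed): three equal blocks, strong within-block signal.
adj, labels = simulate_sbm([100, 100, 100], p_in=0.30, p_out=0.02)
top, gaps = eigengaps(adj)

# The eigengap heuristic: estimate k as the position of the largest gap.
k_hat = int(np.argmax(gaps)) + 1
print("leading eigenvalue magnitudes:", np.round(top, 2))
print("consecutive gaps:", np.round(gaps, 2))
print("estimated number of clusters:", k_hat)
```

In a clean simulation like this one, a single dominant gap is expected. On real networks such as the citation graph discussed here, the spectrum may show several gaps of comparable size, which is the situation the comment cautions about.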

Article information

Ann. Appl. Stat., Volume 10, Number 4 (2016), 1820-1826.

Received: August 2016
First available in Project Euclid: 5 January 2017

Keywords: Networks; spectral clustering; text analysis; eigengap


Wang, Song; Rohe, Karl. Discussion of “Coauthorship and citation networks for statisticians”. Ann. Appl. Stat. 10 (2016), no. 4, 1820–1826. doi:10.1214/16-AOAS977.


  • Bates, D. and Maechler, M. (2016). Matrix: Sparse and dense matrix classes and methods. R package version 1.2-6. Available at
  • Csardi, G. and Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems 1695.
  • Ji, P. and Jin, J. (2014). Coauthorship and citation networks for statisticians. Preprint. Available at arXiv:1410.2840.
  • Jin, J. (2015). Fast community detection by SCORE. Ann. Statist. 43 57–89.
  • Meyer, D., Hornik, K. and Feinerer, I. (2008). Text mining infrastructure in R. J. Stat. Softw. 25 1–54.
  • Qiu, Y. and Mei, J. (2016). rARPACK: Solvers for large scale eigenvalue and svd problems. R package version 0.11-0. Available at
  • Qin, T. and Rohe, K. (2013). Regularized spectral clustering under the degree-corrected stochastic blockmodel. In Advances in Neural Information Processing Systems 3120–3128.
  • Wang, S. and Rohe, K. (2016). Supplement to “Discussion of ‘Coauthorship and citation networks for statisticians’.” DOI:10.1214/16-AOAS977SUPP.

See also

  • Main article: Coauthorship and citation networks for statisticians.

Supplemental materials

  • Code and Data. We provide the code and data sets to reproduce our results in this discussion.