Electronic Journal of Statistics

Unequal edge inclusion probabilities in link-tracing network sampling with implications for Respondent-Driven Sampling

Miles Q. Ott and Krista J. Gile

Full-text: Open access


Respondent-Driven Sampling (RDS) is a widely adopted link-tracing sampling design used to draw valid statistical inference from samples of populations for which there is no available sampling frame. RDS estimators rely upon the assumption that each edge (representing a relationship between two individuals) in the underlying network has an equal probability of being sampled. We show that this assumption is violated in even the simplest cases, and that RDS estimators are sensitive to the violation of this assumption.
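The violation described above can be reproduced on a very small network. The following Python sketch is a toy illustration, not the authors' analysis. It assumes a single seed, one recruit per respondent, and sampling without replacement (a self-avoiding random walk that stops early if the current node has no unsampled neighbor). The function name `edge_inclusion_probs`, the four-node path graph, and the stopping rule are all hypothetical choices made for this example. It enumerates every possible walk and computes exact edge inclusion probabilities with rational arithmetic.

```python
from fractions import Fraction


def edge_inclusion_probs(adj, sample_size, seed_weights=None):
    """Exact probability that each edge is traversed by a self-avoiding walk.

    adj          : dict mapping node -> list of neighbors (undirected graph)
    sample_size  : number of nodes to sample (walk stops early if stuck)
    seed_weights : dict node -> integer weight for seed selection
                   (defaults to a uniformly chosen seed)
    """
    nodes = sorted(adj)
    if seed_weights is None:
        seed_weights = {v: 1 for v in nodes}
    total = sum(seed_weights.values())
    probs = {}

    def walk(current, visited, used_edges, p):
        candidates = [u for u in adj[current] if u not in visited]
        if len(visited) == sample_size or not candidates:
            # Walk ends: credit each traversed edge with this path's probability.
            for e in used_edges:
                probs[e] = probs.get(e, Fraction(0)) + p
            return
        for u in candidates:
            edge = tuple(sorted((current, u)))
            # Each unsampled neighbor is equally likely to be recruited.
            walk(u, visited | {u}, used_edges + [edge], p / len(candidates))

    for v in nodes:
        walk(v, frozenset([v]), [], Fraction(seed_weights[v], total))
    return probs


# Hypothetical example network: a path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
degree = {v: len(nbrs) for v, nbrs in path.items()}

uniform_two = edge_inclusion_probs(path, 2)          # uniform seed, one step
degree_two = edge_inclusion_probs(path, 2, degree)   # degree-weighted seed, one step
degree_three = edge_inclusion_probs(path, 3, degree) # degree-weighted seed, two steps

for label, result in [("uniform seed, n=2", uniform_two),
                      ("degree seed, n=2", degree_two),
                      ("degree seed, n=3", degree_three)]:
    print(label, {e: str(p) for e, p in sorted(result.items())})
```

In this sketch, a uniformly chosen seed followed by one recruitment gives inclusion probabilities of 3/8, 1/4, and 3/8 for the three edges. Drawing the seed with probability proportional to degree equalizes them at 1/3 for a single step, but a second step without replacement yields 1/2, 2/3, and 1/2, so the edges are again unequally likely to be sampled.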

Article information

Electron. J. Statist., Volume 10, Number 1 (2016), 1109-1132.

Received: June 2015
First available in Project Euclid: 29 April 2016

Permanent link to this document: https://projecteuclid.org/euclid.ejs/1461947420

Digital Object Identifier: doi:10.1214/16-EJS1138

Keywords: Respondent-driven sampling; link tracing; network sampling; edge inclusion; random walk


Ott, Miles Q.; Gile, Krista J. Unequal edge inclusion probabilities in link-tracing network sampling with implications for Respondent-Driven Sampling. Electron. J. Statist. 10 (2016), no. 1, 1109-1132. doi:10.1214/16-EJS1138. https://projecteuclid.org/euclid.ejs/1461947420


