The Annals of Applied Statistics

A simulation-based framework for assessing the feasibility of respondent-driven sampling for estimating characteristics in populations of lesbian, gay and bisexual older adults

Maryclare Griffin, Krista J. Gile, Karen I. Fredriksen-Goldsen, Mark S. Handcock, and Elena A. Erosheva

Abstract

Respondent-driven sampling (RDS) is a method for sampling from a target population by leveraging social connections. RDS is invaluable to the study of hard-to-reach populations. However, RDS is costly and can be infeasible. RDS is infeasible when RDS point estimators have small effective sample sizes (large design effects) or when RDS interval estimators have large confidence intervals relative to estimates obtained in previous studies or poor coverage. As a result, researchers need tools to assess whether or not estimation of certain characteristics of interest for specific populations is feasible in advance. In this paper, we develop a simulation-based framework for using pilot data—in the form of a convenience sample of aggregated, egocentric data and estimates of subpopulation sizes within the target population—to assess whether or not RDS is feasible for estimating characteristics of a target population. In doing so, we assume that more is known about egos than alters in the pilot data, which is often the case with aggregated, egocentric data in practice. We build on existing methods for estimating the structure of social networks from aggregated, egocentric sample data and estimates of subpopulation sizes within the target population. We apply this framework to assess the feasibility of estimating the proportion male, proportion bisexual, proportion depressed and proportion infected with HIV/AIDS within three spatially distinct target populations of older lesbian, gay and bisexual adults using pilot data from the Caring and Aging with Pride Study and the Gallup Daily Tracking Survey. We conclude that using an RDS sample of 300 subjects is infeasible for estimating the proportion male, but feasible for estimating the proportion bisexual, proportion depressed and proportion infected with HIV/AIDS in all three target populations.
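As a rough illustration of the feasibility criteria named in the abstract, the following Python sketch summarizes a set of simulated estimates of a proportion by design effect, effective sample size, interval coverage and mean interval width. It is not the authors' code (their supplement provides an R package). The function name `feasibility_summary`, its arguments, and the choice of the simple-random-sampling variance p(1 - p)/n as the benchmark are assumptions made for this example, and the variance ratio ignores any bias in the estimator.

```python
from statistics import mean, pvariance


def feasibility_summary(estimates, intervals, true_p, n):
    """Summarize simulated estimates of a proportion (hypothetical helper).

    estimates: point estimates, one per simulated sample
    intervals: (lower, upper) interval endpoints, one pair per simulated sample
    true_p:    proportion in the simulated population (known in a simulation)
    n:         size of each simulated sample
    """
    intervals = list(intervals)

    # Benchmark: variance of a sample proportion under simple random sampling.
    srs_var = true_p * (1 - true_p) / n

    # Variance of the estimator across simulated samples.
    est_var = pvariance(estimates)

    # Design effect: ratio of the estimator's variance to the benchmark.
    design_effect = est_var / srs_var

    # Share of intervals that contain the true proportion.
    covered = sum(1 for lo, hi in intervals if lo <= true_p <= hi)

    return {
        "design_effect": design_effect,
        "effective_sample_size": n / design_effect,
        "coverage": covered / len(intervals),
        "mean_interval_width": mean(hi - lo for lo, hi in intervals),
    }
```

Under these assumptions, a large design effect (equivalently, an effective sample size well below n), wide intervals, or coverage well below the nominal level would each point toward infeasibility in the sense described in the abstract.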

Article information

Source
Ann. Appl. Stat., Volume 12, Number 4 (2018), 2252-2278.

Dates
Received: July 2016
Revised: February 2018
First available in Project Euclid: 13 November 2018

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1542078044

Digital Object Identifier
doi:10.1214/18-AOAS1151

Mathematical Reviews number (MathSciNet)
MR3875700

Keywords
Respondent-driven sampling; hard to reach populations; social networks; aggregated egocentric sample data; network sampling

Citation

Griffin, Maryclare; Gile, Krista J.; Fredriksen-Goldsen, Karen I.; Handcock, Mark S.; Erosheva, Elena A. A simulation-based framework for assessing the feasibility of respondent-driven sampling for estimating characteristics in populations of lesbian, gay and bisexual older adults. Ann. Appl. Stat. 12 (2018), no. 4, 2252–2278. doi:10.1214/18-AOAS1151. https://projecteuclid.org/euclid.aoas/1542078044


Supplemental materials

  • Additional details and replication materials. The file contains a document titled “Supplement to ‘A simulation-based framework for assessing the feasibility of respondent-driven sampling for estimating characteristics in populations of lesbian, gay and bisexual older adults,’” which includes the five sections referenced in this paper. The file also contains the source code for an R package that includes code written by the authors and synthetic data. A vignette titled “Synthetic data example of methods used in ‘A simulation-based framework for assessing the feasibility of respondent-driven sampling for estimating characteristics in populations of lesbian, gay and bisexual older adults’” applies the methods used in this paper to the synthetic data. Additionally, a README file is included with instructions for installing the R package and accessing the vignette.