Statistics Surveys

A survey of bootstrap methods in finite population sampling

Zeinab Mashreghi, David Haziza, and Christian Léger

Abstract

We review bootstrap methods in the context of survey data where the effect of the sampling design on the variability of estimators has to be taken into account. We present the methods in a unified way by classifying them into three classes: pseudo-population, direct, and survey weights methods. We cover variance estimation and the construction of confidence intervals for stratified simple random sampling as well as some unequal probability sampling designs. We also address the problem of variance estimation in the presence of imputation to compensate for item non-response.

Article information

Source
Statist. Surv., Volume 10 (2016), 1–52.

Dates
Received: December 2014
First available in Project Euclid: 15 March 2016

Permanent link to this document
https://projecteuclid.org/euclid.ssu/1458047831

Digital Object Identifier
doi:10.1214/16-SS113

Mathematical Reviews number (MathSciNet)
MR3476140

Zentralblatt MATH identifier
1351.62025

Subjects
Primary: 62D05: Sampling theory, sample surveys
Secondary: 62F40: Bootstrap, jackknife and other resampling methods

Keywords
Bootstrap; bootstrap weights; confidence intervals; imputation; missing data; multistage designs; pseudo-population approach; survey sampling; unequal probability sampling; variance estimation

Citation

Mashreghi, Zeinab; Haziza, David; Léger, Christian. A survey of bootstrap methods in finite population sampling. Statist. Surv. 10 (2016), 1–52. doi:10.1214/16-SS113. https://projecteuclid.org/euclid.ssu/1458047831

