Statistical Science

Construction of Weights in Surveys: A Review

David Haziza and Jean-François Beaumont

Abstract

Weighting is one of the central steps in surveys. The typical weighting process involves three major stages. At the first stage, each unit is assigned a base weight, defined as the inverse of its inclusion probability. At the second stage, the base weights are modified to account for unit nonresponse. At the third stage, the nonresponse-adjusted weights are further modified to ensure consistency between survey estimates and known population totals. When needed, the weights undergo a final modification through weight trimming or weight smoothing in order to improve the efficiency of survey estimates. This article provides an overview of the stages involved in the typical weighting process used by national statistical offices.
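The stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes the simplest common choice at each stage (weighting classes for the nonresponse adjustment, post-stratification as the calibration step, and a fixed cap for trimming), and all function names, class labels, and numbers are hypothetical.

```python
from collections import defaultdict


def base_weights(inclusion_probs):
    """Stage 1: base weight = inverse of the inclusion probability."""
    return [1.0 / p for p in inclusion_probs]


def nonresponse_adjust(weights, responded, classes):
    """Stage 2 (assumed weighting-class method): within each class,
    respondents absorb the weight of the nonrespondents."""
    sampled_total = defaultdict(float)
    respondent_total = defaultdict(float)
    for w, r, c in zip(weights, responded, classes):
        sampled_total[c] += w
        if r:
            respondent_total[c] += w
    return [
        w * sampled_total[c] / respondent_total[c] if r else 0.0
        for w, r, c in zip(weights, responded, classes)
    ]


def poststratify(weights, strata, known_totals):
    """Stage 3 (assumed post-stratification): rescale weights so that the
    weighted count in each post-stratum equals the known population count."""
    estimated = defaultdict(float)
    for w, s in zip(weights, strata):
        estimated[s] += w
    return [
        w * known_totals[s] / estimated[s] if w > 0 else 0.0
        for w, s in zip(weights, strata)
    ]


def trim(weights, cap):
    """Optional step: cap extreme weights. This simple cap breaks the
    calibration constraints, so a re-calibration would usually follow."""
    return [min(w, cap) for w in weights]


# Hypothetical sample of six units.
probs = [0.1, 0.1, 0.2, 0.2, 0.5, 0.5]
responded = [True, False, True, True, True, False]
nr_classes = ["a", "a", "b", "b", "c", "c"]
post_strata = ["m", "m", "m", "f", "f", "f"]
known_totals = {"m": 30.0, "f": 10.0}

w1 = base_weights(probs)
w2 = nonresponse_adjust(w1, responded, nr_classes)
w3 = poststratify(w2, post_strata, known_totals)
w4 = trim(w3, cap=20.0)
print(w1, w2, w3, w4, sep="\n")
```

In this sketch the nonresponse adjustment preserves the sum of the base weights, and the post-stratified weights reproduce the assumed totals for each post-stratum. The trimmed weights no longer do, which is why trimming is typically followed by another calibration pass.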

Article information

Source
Statist. Sci., Volume 32, Number 2 (2017), 206-226.

Dates
First available in Project Euclid: 11 May 2017

Permanent link to this document
https://projecteuclid.org/euclid.ss/1494489812

Digital Object Identifier
doi:10.1214/16-STS608

Mathematical Reviews number (MathSciNet)
MR3648956

Zentralblatt MATH identifier
1381.62025

Keywords
Calibration estimator; design-based framework; expansion estimator; propensity score adjusted estimator; unequal probability sampling; unit nonresponse; weight smoothing; weight trimming; weighting system

Citation

Haziza, David; Beaumont, Jean-François. Construction of Weights in Surveys: A Review. Statist. Sci. 32 (2017), no. 2, 206–226. doi:10.1214/16-STS608. https://projecteuclid.org/euclid.ss/1494489812


