Statistical Science

Impact of Frequentist and Bayesian Methods on Survey Sampling Practice: A Selective Appraisal

J. N. K. Rao


Abstract

According to Hansen, Madow and Tepping [J. Amer. Statist. Assoc. 78 (1983) 776–793], “Probability sampling designs and randomization inference are widely accepted as the standard approach in sample surveys.” This article advances reasons for the wide use of this design-based approach, particularly by federal agencies and other survey organizations conducting complex large-scale surveys on topics related to public policy. The impact of Bayesian methods on survey sampling is also discussed in two directions: nonparametric calibrated Bayesian inferences from large samples, and hierarchical Bayes methods for small area estimation based on parametric models.

Article information

Source
Statist. Sci., Volume 26, Number 2 (2011), 240–256.

Dates
First available in Project Euclid: 1 August 2011

Permanent link to this document
https://projecteuclid.org/euclid.ss/1312204015

Digital Object Identifier
doi:10.1214/10-STS346

Mathematical Reviews number (MathSciNet)
MR2858389

Zentralblatt MATH identifier
1246.62061

Keywords
Bayesian pseudo-empirical likelihood; design-based approach; hierarchical Bayes methods; model-dependent approach; model-assisted methods; Polya posterior; small area estimation

Citation

Rao, J. N. K. Impact of Frequentist and Bayesian Methods on Survey Sampling Practice: A Selective Appraisal. Statist. Sci. 26 (2011), no. 2, 240–256. doi:10.1214/10-STS346. https://projecteuclid.org/euclid.ss/1312204015



References

  • Aitkin, M. (2008). Applications of the Bayesian bootstrap in finite population inference. J. Off. Statist. 24 21–51.
  • Bayarri, M. J. and Berger, J. O. (2000). p values for composite null models. J. Amer. Statist. Assoc. 95 1127–1142, 1157–1170.
  • Bayarri, M. J. and Castellanos, M. E. (2007). Bayesian checking of the second levels of hierarchical models. Statist. Sci. 22 322–343.
  • Bell, W. R. (1999). Accounting for uncertainty about variances in small area estimation. In Bull. Int. Statist. Inst.: 52nd Session. Available at www.census.gov/hhes/www/saipe under “Publications.”
  • Bell, W. R. (2008). Examining sensitivity of small area inferences to uncertainty about sampling error variances. In Proceedings of the Survey Research Methods Section 327–333. Amer. Statist. Assoc., Alexandria, VA.
  • Bellhouse, D. R. and Rao, J. N. K. (1986). On the efficiency of prediction estimators in two-stage sampling. J. Statist. Plann. Inference 13 269–281.
  • Binder, D. A. (1982). Nonparametric Bayesian models for samples from finite populations. J. Roy. Statist. Soc. Ser. B 44 388–393.
  • Bizier, V., You, Y., Veilleux, L. and Grondin, C. (2008). Model-based approach to small area estimation of disability count and rate using data from the 2006 participation and activity limitation survey. Technical report, Household Survey Methods Division, Statistics Canada.
  • Bowley, A. L. (1926). Measurement of the precision attained in sampling. Bull. Int. Statist. Inst. 22, Supplement to Liv 1 6–62.
  • van den Brakel, J. A. and Bethlehem, J. (2008). Model-based estimation for official statistics. Discussion Paper 08002, Statistics Netherlands.
  • Breidt, F. J., Claeskens, G. and Opsomer, J. D. (2005). Model-assisted estimation for complex surveys using penalised splines. Biometrika 92 831–846.
  • Brewer, K. R. W. (1963). Ratio estimation and finite populations: Some results deducible from the assumption of an underlying stochastic process. Austral. J. Statist. 5 93–105.
  • Browne, W. J. and Draper, D. (2001). A comparison of Bayesian and likelihood-based methods for fitting multilevel models. Technical report, Institute for Education, London, England.
  • Cao, W., Tsiatis, A. and Davidian, M. (2009). Improving efficiency and robustness of the doubly robust estimators for a population mean with incomplete data. Biometrika 96 723–734.
  • Casady, R. J. and Valliant, R. (1993). Conditional properties of post-stratified estimators under normal theory. Survey Methodol. 19 183–192.
  • Datta, G. S. (2008). Private communication.
  • Datta, G. S. (2009). Model-based approach to small area estimation. In Handbook of Statistics: Sample Surveys: Inference and Analysis 29B (D. Pfeffermann and C. R. Rao, eds.) 251–288. North-Holland, Amsterdam.
  • Datta, G. S., Rao, J. N. K. and Smith, D. D. (2005). On measuring the variability of small area estimators under a basic area level model. Biometrika 92 183–196.
  • Deming, W. E. (1960). Sample Design in Business Research. Wiley, New York.
  • Deville, J.-C. and Särndal, C.-E. (1992). Calibration estimators in survey sampling. J. Amer. Statist. Assoc. 87 376–382.
  • Durbin, J. (1958). Sampling theory for estimates based on fewer individuals than the number selected. Bull. Inst. Internat. Statist. 36 113–119.
  • Ericson, W. A. (1969). Subjective Bayesian models in sampling finite populations. J. Roy. Statist. Soc. Ser. B 31 195–233.
  • Fang, K.-T. and Mukerjee, R. (2006). Empirical-type likelihoods allowing posterior credible sets with frequentist validity: Higher-order asymptotics. Biometrika 93 723–733.
  • Francisco, C. A. and Fuller, W. A. (1991). Quantile estimation with a complex survey design. Ann. Statist. 19 454–469.
  • Fuller, W. A. (2009). Sampling Statistics. Wiley, Hoboken, NJ.
  • Ganesh, N. and Lahiri, P. (2008). A new class of average moment matching priors. Biometrika 95 514–520.
  • Gelman, A. (2007). Struggles with survey weighting and regression modeling. Statist. Sci. 22 153–164.
  • Godambe, V. P. (1966). A new approach to sampling from finite populations. I. Sufficiency and linear estimation. J. Roy. Statist. Soc. Ser. B 28 310–319.
  • Hall, P. (2003). A short prehistory of the bootstrap. Statist. Sci. 18 158–167.
  • Hansen, M. H. and Hurwitz, W. N. (1943). On the theory of sampling from finite populations. Ann. Math. Statist. 14 333–362.
  • Hansen, M. H., Hurwitz, W. N., Marks, E. S. and Mauldin, W. P. (1951). Response errors in surveys. J. Amer. Statist. Assoc. 46 147–190.
  • Hansen, M. H., Hurwitz, W. N., Nisselson, H. and Steinberg, J. (1955). The redesign of the census current population survey. J. Amer. Statist. Assoc. 50 701–719.
  • Hansen, M. H., Madow, W. G. and Tepping, B. J. (1983). An evaluation of model-dependent and probability sampling inferences in sample surveys. J. Amer. Statist. Assoc. 78 776–793.
  • Hartley, H. O. (1959). Analytical studies of survey data. In Volume in Honor of Corrado Gini 1–32. Instituto di Statistica, Rome.
  • Hartley, H. O. and Rao, J. N. K. (1968). A new estimation theory for sample surveys. Biometrika 55 547–557.
  • Haziza, D. (2009). Imputation and inference in the presence of missing data. In Sample Surveys: Design, Methods and Applications. Handbook of Statist. 29 215–246. Elsevier/North-Holland, Amsterdam.
  • Haziza, D. and Rao, J. N. K. (2006). A nonresponse model approach to inference under imputation for missing survey data. Survey Methodol. 32 53–64.
  • Hoadley, B. (1969). The compound multinomial distribution and Bayesian analysis of categorical data from finite populations. J. Amer. Statist. Assoc. 64 216–229.
  • Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47 663–685.
  • Jiang, J. and Lahiri, P. (2006). Mixed model prediction and small area estimation. Test 15 1–96.
  • Kalton, G. (2002). Models in practice of survey sampling. J. Off. Statist. 18 129–154.
  • Kim, J. K. and Rao, J. N. K. (2009). A unified approach to linearization variance estimation from survey data after imputation for item nonresponse. Biometrika 96 917–932.
  • Kim, J. K., Brick, J. M., Fuller, W. A. and Kalton, G. (2006). On the bias of the multiple-imputation variance estimator in survey sampling. J. R. Stat. Soc. Ser. B Stat. Methodol. 68 509–521.
  • Korn, E. L. and Graubard, B. I. (2003). Estimating variance components by using survey data. J. R. Stat. Soc. Ser. B Stat. Methodol. 65 175–190.
  • Kovar, J. G., Rao, J. N. K. and Wu, C. F. J. (1988). Bootstrap and other methods to measure errors in survey estimates. Canad. J. Statist. 16 25–45.
  • Lazar, N. A. (2003). Bayesian empirical likelihood. Biometrika 90 319–326.
  • Lazar, R., Meeden, G. and Nelson, D. (2008). A non-informative Bayesian approach to finite population sampling using auxiliary variables. Survey Methodol. 34 51–64.
  • Li, H. and Lahiri, P. (2010). An adjusted maximum likelihood method for solving small area estimation problems. J. Multivariate Anal. 101 882–892.
  • Little, R. J. A. (1983). Estimating a finite population mean from unequal probability samples. J. Amer. Statist. Assoc. 78 596–604.
  • Little, R. J. (2008). Weighting and prediction in sample surveys. Calcutta Statist. Assoc. Bull. 60 147–167.
  • Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data, 2nd ed. Wiley, Hoboken, NJ.
  • Lo, A. Y. (1988). A Bayesian bootstrap for a finite population. Ann. Statist. 16 1684–1695.
  • Lohr, S. L. (2007). Comment: Struggles with survey weighting and regression modeling. Statist. Sci. 22 175–178.
  • Mahalanobis, P. C. (1944). On large scale sample surveys. Phil. Trans. Roy. Soc. B 231 329–351.
  • Mahalanobis, P. C. (1946). Recent experiments in statistical sampling in the Indian Statistical Institute. J. Roy. Statist. Soc. 109 325–378.
  • Malec, D. and Sedransk, J. (1985). Bayesian inference for finite population parameters in multistage cluster sampling. J. Amer. Statist. Assoc. 80 897–902.
  • Meeden, G. (1995). Median estimation using auxiliary information. Survey Methodol. 21 71–77.
  • Meeden, G. (1999). A noninformative Bayesian approach for two-stage cluster sampling. Sankhyā Ser. B 61 133–144.
  • Meeden, G. (2003). A noninformative Bayesian approach to small area estimation. Survey Methodol. 29 19–24.
  • Meeden, G. and Vardeman, S. (1991). A noninformative Bayesian approach to interval estimation in finite population sampling. J. Amer. Statist. Assoc. 86 972–980.
  • Mohadjer, L., Rao, J. N. K., Liu, B., Krenzke, T. and Van de Kerckhove, W. (2007). Hierarchical Bayes small area estimates of adult literacy using unmatched sampling and linking models. In Proceedings of the Survey Research Methods Section 3203–3209. Amer. Statist. Assoc., Alexandria, VA.
  • Morris, C. E. (2006). Mixed model prediction and small area estimation. Test 15 72–76.
  • Murthy, M. N. (1964). On Mahalanobis’ contributions to the development of sample survey theory and methods. In Contributions to Statistics (C. R. Rao, ed.) 283–316. Statistical Publishing Society, Calcutta, India.
  • Nandram, B. and Choi, J. W. (2005). Hierarchical Bayesian nonignorable nonresponse regression models for small area: An application to the NHANES data. Survey Methodol. 31 73–84.
  • Nandram, B., Cox, L. H. and Choi, J. W. (2005). Bayesian analysis of nonignorable missing categorical data: An application to bone mineral density and family income. Survey Methodol. 31 213–225.
  • Nandram, B., Sedransk, J. and Smith, S. J. (1997). Order-restricted Bayesian estimation of the age composition of a population of Atlantic cod. J. Amer. Statist. Assoc. 92 33–40.
  • Narain, R. D. (1951). On sampling without replacement with varying probabilities. J. Indian Soc. Agric. Statistics 3 169–174.
  • Nelson, D. and Meeden, G. (1998). Using prior information about population quantiles in finite population sampling. Sankhyā Ser. A 60 426–445.
  • Neyman, J. (1934). On the two different approaches of the representative method: The method of stratified sampling and the method of purposive selection. J. Roy. Statist. Soc. 97 558–606.
  • Owen, A. B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75 237–249.
  • Pfeffermann, D. (1993). The role of sampling weights when modeling survey data. Internat. Statist. Rev. 61 317–337.
  • Pfeffermann, D. (2008). Discussion. Calcutta Statist. Assoc. Bull. 60 170–175.
  • Pfeffermann, D. and Sverchkov, M. Y. (2003). Fitting generalized linear models under informative sampling. In Analysis of Survey Data (Southampton, 1999). Wiley Ser. Surv. Methodol. (R. Chambers and C. J. Skinner, eds.) 175–195. Wiley, Chichester.
  • Pfeffermann, D., Moura, F. A. S. and Silva, P. L. N. (2006). Multi-level modelling under informative sampling. Biometrika 93 943–959.
  • Pfeffermann, D., Skinner, C. J., Holmes, D. J., Goldstein, H. and Rasbash, J. (1998). Weighting for unequal selection probabilities in multilevel models. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 23–56.
  • Rabe-Hesketh, S. and Skrondal, A. (2006). Multilevel modelling of complex survey data. J. Roy. Statist. Soc. Ser. A 169 805–827.
  • Raghunathan, T. E., Xie, D., Schenker, N., Parsons, V. L., Davis, W. W., Dodd, K. W. and Feuer, E. J. (2007). Combining information from two surveys to estimate county-level prevalence rates of cancer risk factors and screening. J. Amer. Statist. Assoc. 102 474–486.
  • Rao, J. N. K. (1992). Estimating totals and distribution functions using auxiliary information at the estimation stage. In Proceedings of the Workshop on Uses of Auxiliary Information in Surveys. Statistics Sweden.
  • Rao, J. N. K. (2003). Small Area Estimation. Wiley, Hoboken, NJ.
  • Rao, J. N. K. (2005). Interplay between sample survey theory and practice: An appraisal. Survey Methodol. 31 117–138.
  • Rao, J. N. K. and Ghangurde, P. D. (1972). Bayesian optimization in sampling finite populations. J. Amer. Statist. Assoc. 67 439–443.
  • Rao, J. N. K. and Wu, C. F. J. (1987). Methods for standard errors and confidence intervals from sample survey data: Some recent work. In Proceedings of the 46th Session of the International Statistical Institute, Vol. 3 (Tokyo, 1987) 52 5–21.
  • Rao, J. N. K. and Wu, C. (2009). Empirical likelihood methods. In Handbook of Statistics: Sample Surveys: Inference and Analysis 29B (D. Pfeffermann and C. R. Rao, eds.) 189–208. North-Holland, Amsterdam.
  • Rao, J. N. K. and Wu, C. (2010). Bayesian pseudo empirical likelihood intervals for complex surveys. J. Roy. Statist. Soc. Ser. B 72 533–544.
  • Rao, J. N. K., Jocelyn, W. and Hidiroglou, M. A. (2003). Confidence interval coverage probabilities for regression estimators in uni-phase and two-phase sampling. J. Off. Statist. 19 17–30.
  • Rao, J. N. K., Verret, F. and Hidiroglou, M. A. (2010). A weighted estimating equations approach to inference for two-level models from survey data. In Proc. Survey Sec. Statistical Society of Canada Annual Meeting. May 2010, Québec, Canada.
  • Rao, J. N. K., Hidiroglou, M., Yung, W. and Kovacevic, M. (2010). Role of weights in descriptive and analytical inference from survey data: An overview. J. Ind. Soc. Agric. Statist. 64 129–135.
  • Reiter, J. P., Raghunathan, T. E. and Kinney, S. K. (2006). The importance of modeling the sampling design in multiple imputation for missing data. Survey Methodol. 32 143–149.
  • Robinson, P. M. and Särndal, C.-E. (1983). Asymptotic properties of the generalized regression estimator in probability sampling. Sankhyā Ser. B 45 240–248.
  • Royall, R. M. (1968). An old approach to finite population sampling theory. J. Amer. Statist. Assoc. 63 1269–1279.
  • Royall, R. M. (1970). On finite population sampling theory under certain linear regression models. Biometrika 57 377–387.
  • Royall, R. M. and Pfeffermann, D. (1982). Balanced samples and robust Bayesian inference in finite population sampling. Biometrika 69 401–409.
  • Rubin, D. B. (1981). The Bayesian bootstrap. Ann. Statist. 9 130–134.
  • Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. Wiley, New York.
  • Särndal, C.-E. (2007). The calibration approach in survey theory and practice. Survey Methodol. 33 99–119.
  • Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. Springer, New York.
  • Scott, A. J. and Smith, T. M. F. (1969). Estimation in multi-stage surveys. J. Amer. Statist. Assoc. 76 681–689.
  • Sedransk, J. (1977). Sampling problems in the estimation of the money supply. J. Amer. Statist. Assoc. 72 516–521.
  • Sedransk, J. (2008). Assessing the value of Bayesian methods for inference about finite population quantities. J. Off. Statist. 24 495–506.
  • Sinharay, S. and Stern, H. S. (2003). Posterior predictive model checking in hierarchical models. J. Statist. Plann. Inference 111 209–221.
  • Smith, T. M. F. (1994). Sample surveys 1975-90; an age of reconciliation. Int. Statist. Rev. 62 5–34.
  • Valliant, R., Dorfman, A. H. and Royall, R. M. (2000). Finite Population Sampling and Inference: A Prediction Approach. Wiley-Interscience, New York.
  • Woodruff, R. S. (1952). Confidence intervals for medians and other position measures. J. Amer. Statist. Assoc. 47 635–646.
  • Wu, C. and Rao, J. N. K. (2006). Pseudo-empirical likelihood ratio confidence intervals for complex surveys. Canad. J. Statist. 34 359–375.
  • You, Y. and Rao, J. N. K. (2002). Small area estimation using unmatched sampling and linking models. Canad. J. Statist. 30 3–15.
  • Zheng, H. and Little, R. J. A. (2003). Penalized spline model-based estimation of the finite populations total from probability-proportional-to-size samples. J. Off. Statist. 19 99–117.
  • Zheng, H. and Little, R. J. A. (2005). Inference for the population total from probability proportional-to-size samples based on predictions from a penalized spline nonparametric model. J. Off. Statist. 21 1–20.

See also

  • Discussion of: Impact of Frequentist and Bayesian Methods on Survey Sampling Practice: A Selective Appraisal by J. N. K. Rao.
  • Rejoinder: Impact of Frequentist and Bayesian Methods on Survey Sampling Practice: A Selective Appraisal.