Bayesian Analysis

Bayesian Estimation Under Informative Sampling with Unattenuated Dependence

Matthew R. Williams and Terrance D. Savitsky

Advance publication

This article is in its final form and can be cited using the date of online publication and the DOI.

Full-text: Open access

Abstract

An informative sampling design leads to unit inclusion probabilities that are correlated with the response variable of interest. However, multistage sampling designs may also induce higher-order dependencies, which the literature ignores when it establishes consistency of estimators for survey data under a condition requiring asymptotic independence among the unit inclusion probabilities. This paper constructs new theoretical conditions that guarantee that the pseudo-posterior, which uses sampling weights based on first-order inclusion probabilities to exponentiate the likelihood, is consistent not only for survey designs that have asymptotic factorization, but also for survey designs that induce residual or unattenuated dependence among sampled units. The use of the survey-weighted pseudo-posterior, together with our relaxed requirements for the survey design, establishes a wide variety of analysis models that can be applied to a broad class of survey data sets. Using the complex sampling design of the National Survey on Drug Use and Health, we demonstrate our new theoretical result on multistage designs characterized by a cluster sampling step that expresses within-cluster dependence. We explore the impact of multistage designs and order-based sampling.
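
To make the weighting mechanism concrete, the following is a minimal sketch of a pseudo-posterior in which each unit's likelihood contribution is raised to a weight proportional to the inverse of its first-order inclusion probability. It is an illustration under stated assumptions, not the authors' implementation: the model (a normal mean with known variance and a conjugate normal prior), the function names pseudo_posterior_normal_mean and simulate, the logistic form of the inclusion probabilities, and the convention of normalizing the weights to sum to the sample size are all choices made here for illustration. The simulated design is independent Poisson sampling, so it shows only the correction for informativeness and does not reproduce the within-cluster dependence studied in the paper.

```python
import math
import random


def pseudo_posterior_normal_mean(y, pi, sigma2=1.0, prior_mean=0.0, prior_var=100.0):
    """Pseudo-posterior mean and variance for a normal mean (known variance).

    Each likelihood term N(y_i | mu, sigma2) is raised to the power w_i,
    where w_i is proportional to 1 / pi_i and the weights are normalized
    to sum to the sample size n (an assumed convention for this sketch).
    With a N(prior_mean, prior_var) prior the result is again normal.
    """
    n = len(y)
    raw = [1.0 / p for p in pi]
    total = sum(raw)
    w = [n * r / total for r in raw]

    # Exponentiating a normal likelihood by w_i scales its precision by w_i.
    precision = 1.0 / prior_var + sum(w) / sigma2
    weighted_sum = sum(wi * yi for wi, yi in zip(w, y))
    mean = (prior_mean / prior_var + weighted_sum / sigma2) / precision
    return mean, 1.0 / precision


def simulate(seed=1, population_size=20000):
    """Draw an informative Poisson sample and compare weighted vs unweighted fits."""
    rng = random.Random(seed)
    population = [rng.gauss(0.0, 1.0) for _ in range(population_size)]

    # Hypothetical informative design: inclusion probability increases with y.
    incl_prob = [0.02 + 0.08 / (1.0 + math.exp(-y)) for y in population]

    # Poisson sampling: each unit is included independently with probability pi_i.
    sample = [(y, p) for y, p in zip(population, incl_prob) if rng.random() < p]
    ys = [s[0] for s in sample]
    ps = [s[1] for s in sample]

    weighted_mean, _ = pseudo_posterior_normal_mean(ys, ps)
    # Equal inclusion probabilities give w_i = 1, i.e. the ordinary posterior.
    unweighted_mean, _ = pseudo_posterior_normal_mean(ys, [1.0] * len(ys))
    return weighted_mean, unweighted_mean


if __name__ == "__main__":
    weighted, unweighted = simulate()
    print(f"weighted pseudo-posterior mean: {weighted:.3f}")
    print(f"unweighted posterior mean:      {unweighted:.3f}")
```

In this sketch the population mean is zero, and units with larger responses are more likely to be sampled, so the unweighted posterior mean is pulled upward while the weighted pseudo-posterior mean stays near zero.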

Article information

Source
Bayesian Anal., Advance publication (2018), 21 pages.

Dates
First available in Project Euclid: 4 January 2019

Permanent link to this document
https://projecteuclid.org/euclid.ba/1546570987

Digital Object Identifier
doi:10.1214/18-BA1143

Subjects
Primary: 62D05: Sampling theory, sample surveys; 62G20: Asymptotic properties

Keywords
cluster sampling; stratification; survey sampling; sampling weights; Markov chain Monte Carlo

Rights
Creative Commons Attribution 4.0 International License.

Citation

Williams, Matthew R.; Savitsky, Terrance D. Bayesian Estimation Under Informative Sampling with Unattenuated Dependence. Bayesian Anal., advance publication, 4 January 2019. doi:10.1214/18-BA1143. https://projecteuclid.org/euclid.ba/1546570987



Supplemental materials

  • Appendices for “Bayesian Estimation Under Informative Sampling with Unattenuated Dependence”. Supplementary appendices are provided online to support the proof of Theorem 1 and to demonstrate example code and MCMC diagnostics.