## Bayesian Analysis

### Bayesian Estimation Under Informative Sampling with Unattenuated Dependence

#### Abstract

An informative sampling design leads to unit inclusion probabilities that are correlated with the response variable of interest. However, multistage sampling designs may also induce higher-order dependencies, which are ignored in the literature when establishing consistency of estimators for survey data under a condition requiring asymptotic independence among the unit inclusion probabilities. This paper constructs new theoretical conditions that guarantee that the pseudo-posterior, which uses sampling weights based on first-order inclusion probabilities to exponentiate the likelihood, is consistent not only for survey designs which have asymptotic factorization, but also for survey designs that induce residual or unattenuated dependence among sampled units. The use of the survey-weighted pseudo-posterior, together with our relaxed requirements for the survey design, establishes a wide variety of analysis models that can be applied to a broad class of survey data sets. Using the complex sampling design of the National Survey on Drug Use and Health, we demonstrate our new theoretical result on multistage designs characterized by a cluster sampling step that expresses within-cluster dependence. We explore the impact of multistage designs and order-based sampling.
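As a rough illustration of the pseudo-posterior idea described in the abstract, the sketch below raises each likelihood contribution to a weight derived from its first-order inclusion probability. It is not the authors' code. The normal mean model, the conjugate prior, the inclusion rule `0.044 * exp(0.5 * y)`, and the rescaling of weights to sum to the sample size are all assumptions made for this example. The simulated design samples units independently, so it only illustrates the informative-sampling bias and not the within-cluster dependence the paper addresses.

```python
import math
import random


def normalized_weights(inclusion_probs):
    """Inverse-probability weights rescaled to sum to the sample size n (assumed convention)."""
    raw = [1.0 / p for p in inclusion_probs]
    scale = len(raw) / sum(raw)
    return [scale * r for r in raw]


def pseudo_posterior_mean(ys, weights, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """Posterior mean of mu when each N(y_i | mu, sigma^2) term is raised to w_i.

    Exponentiating a normal likelihood term by w_i multiplies its precision
    by w_i, so the conjugate normal update stays in closed form.
    """
    prior_prec = 1.0 / prior_sd**2
    data_prec = sum(weights) / sigma**2
    weighted_sum = sum(w * y for w, y in zip(weights, ys)) / sigma**2
    return (prior_prec * prior_mean + weighted_sum) / (prior_prec + data_prec)


# Simulated finite population with true mean near zero.
rng = random.Random(2020)
population = [rng.gauss(0.0, 1.0) for _ in range(20000)]
pop_mean = sum(population) / len(population)

# Hypothetical informative design: inclusion probability grows with the response y.
sample, probs = [], []
for y in population:
    pi = min(1.0, 0.044 * math.exp(0.5 * y))
    if rng.random() < pi:
        sample.append(y)
        probs.append(pi)

weighted = pseudo_posterior_mean(sample, normalized_weights(probs))
unweighted = pseudo_posterior_mean(sample, [1.0] * len(sample))
print(f"population mean {pop_mean:.3f}, weighted {weighted:.3f}, unweighted {unweighted:.3f}")
```

Under this toy design, the unweighted estimate is pulled toward larger responses because those units are more likely to be sampled, while the weighted version should land much closer to the population mean.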

#### Article information

Source
Bayesian Anal., Volume 15, Number 1 (2020), 57-77.

Dates
First available in Project Euclid: 4 January 2019

Permanent link to this document
https://projecteuclid.org/euclid.ba/1546570987

Digital Object Identifier
doi:10.1214/18-BA1143

Mathematical Reviews number (MathSciNet)
MR4050877

Zentralblatt MATH identifier
07169608

#### Citation

Williams, Matthew R.; Savitsky, Terrance D. Bayesian Estimation Under Informative Sampling with Unattenuated Dependence. Bayesian Anal. 15 (2020), no. 1, 57–77. doi:10.1214/18-BA1143. https://projecteuclid.org/euclid.ba/1546570987

#### Supplemental materials

• Appendices for “Bayesian Estimation Under Informative Sampling with Unattenuated Dependence”. Supplementary appendices are provided online to support the proof of Theorem 1 and to demonstrate example code and MCMC diagnostics.