Abstract
Confidence intervals are a fundamental tool for quantifying the uncertainty of parameters of interest. With growing awareness of data privacy, developing private versions of confidence intervals has attracted increasing attention from both statisticians and computer scientists. Differential privacy is a state-of-the-art framework for analyzing privacy loss when releasing statistics computed from sensitive data. Recent work has addressed differentially private confidence intervals, yet to the best of our knowledge, rigorous methodologies for differentially private confidence intervals in the context of survey sampling have not been studied. In this paper, we propose three differentially private algorithms for constructing confidence intervals for proportions under stratified random sampling. We articulate two variants of differential privacy that make sense for data from stratified sampling designs, and we analyze each of our algorithms within one of these two variants. We establish analytical privacy guarantees and asymptotic properties of the estimators. In addition, we conduct simulation studies to evaluate the proposed private confidence intervals and provide two applications to the 1940 Census data.
Funding Statement
The research presented in this paper was supported by the U.S. Census Bureau Cooperative Agreement CB20ADR0160001.
Acknowledgments
We are grateful for helpful conversations with and comments from (in no particular order) Rolando Rodriguez, Brian Finley, Jörg Drechsler, Gary Benedetto, Michael Freiman, and Justin Doty.
Citation
Shurong Lin, Mark Bun, Marco Gaboardi, Eric D. Kolaczyk, Adam Smith. "Differentially private confidence intervals for proportions under stratified random sampling." Electron. J. Statist. 18 (1) 1455-1494, 2024. https://doi.org/10.1214/24-EJS2234
Information