Open Access
Assessing the protection provided by misclassification-based disclosure limitation methods for survey microdata
September 2010
Natalie Shlomo, Chris Skinner
Ann. Appl. Stat. 4(3): 1291-1310 (September 2010). DOI: 10.1214/09-AOAS317


Government statistical agencies often apply statistical disclosure limitation techniques to survey microdata to protect the confidentiality of respondents. There is a need for valid and practical ways to assess the protection provided. This paper develops some simple methods for disclosure limitation techniques which perturb the values of categorical identifying variables. The methods are applied in numerical experiments based upon census data from the United Kingdom which are subject to two perturbation techniques: data swapping (random and targeted) and the post randomization method. Some simplifying approximations to the measure of risk are found to work well in capturing the impacts of these techniques. These approximations provide simple extensions of existing risk assessment methods based upon Poisson log-linear models. A numerical experiment is also undertaken to assess the impact of multivariate misclassification with an increasing number of identifying variables. It is found that the misclassification dominates the usual monotone increasing relationship between this number and risk so that the risk eventually declines, implying less sensitivity of risk to choice of identifying variables. The methods developed in this paper may also be used to obtain more realistic assessments of risk which take account of the kinds of measurement and other nonsampling errors commonly arising in surveys.
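To make the idea of perturbing a categorical identifying variable concrete, the following is a minimal sketch of a post-randomization-style misclassification step. It is an illustration under stated assumptions, not the authors' implementation: the function name `pram`, the category labels, and the transition probabilities are all hypothetical, and the sketch does not reproduce the paper's risk measures or its census experiments.

```python
import random


def pram(values, categories, transition, seed=None):
    """Perturb each categorical value independently.

    transition[i][j] is the (assumed) probability that a record whose true
    category is categories[i] is released as categories[j].
    """
    rng = random.Random(seed)
    index = {c: i for i, c in enumerate(categories)}
    # Each row must be a probability distribution over released categories.
    for row in transition:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the transition matrix must sum to 1")
    return [
        rng.choices(categories, weights=transition[index[v]], k=1)[0]
        for v in values
    ]


# Hypothetical categories and misclassification matrix (illustrative only).
categories = ["single", "married", "widowed"]
transition = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]

original = ["single", "married", "married", "widowed", "single"]
perturbed = pram(original, categories, transition, seed=0)

# Count how many records kept their original category after perturbation.
unchanged = sum(o == p for o, p in zip(original, perturbed))
print(perturbed, unchanged)
```

A larger diagonal in the matrix leaves more records unchanged, while smaller diagonal entries perturb more records. How such misclassification probabilities enter a disclosure risk measure is the subject of the paper itself and is not modeled here.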



Natalie Shlomo, Chris Skinner. "Assessing the protection provided by misclassification-based disclosure limitation methods for survey microdata." Ann. Appl. Stat. 4(3): 1291-1310, September 2010.


Published: September 2010
First available in Project Euclid: 18 October 2010

zbMATH: 1202.62011
MathSciNet: MR2758329
Digital Object Identifier: 10.1214/09-AOAS317

Keywords: data swapping, disclosure risk, identification risk, log-linear model, measurement error, post randomization method

Rights: Copyright © 2010 Institute of Mathematical Statistics


Vol. 4 • No. 3 • September 2010