Given the public’s ever-increasing concerns about data confidentiality, in the near future statistical agencies may be unable or unwilling, or even may not be legally allowed, to release any genuine microdata—data on individual units, such as individuals or establishments. In such a world, an alternative dissemination strategy is remote access analysis servers, to which users submit requests for output from statistical models fit using the data, but are not allowed access to the data themselves. Analysis servers, however, are not free from the risk of disclosure, especially in the face of multiple, interacting queries. We describe these risks and propose quantifiable measures of risk and data utility that can be used to specify which queries can be answered and with what output. The risk–utility framework is illustrated for regression models.
"Data Dissemination and Disclosure Limitation in a World Without Microdata: A Risk–Utility Framework for Remote Access Analysis Servers." Statist. Sci. 20 (2) 163 - 177, May 2005. https://doi.org/10.1214/088342305000000043