October 2021
The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy
T. Tony Cai, Yichen Wang, Linjun Zhang
Ann. Statist. 49(5): 2825-2850 (October 2021). DOI: 10.1214/21-AOS2058

Abstract

Privacy-preserving data analysis is a rising challenge in contemporary statistics, as the privacy guarantees of statistical methods are often achieved at the expense of accuracy. In this paper, we investigate the tradeoff between statistical accuracy and privacy in mean estimation and linear regression, under both the classical low-dimensional and modern high-dimensional settings. A primary focus is to establish minimax optimality for statistical estimation with the (ε,δ)-differential privacy constraint. By refining the “tracing adversary” technique for lower bounds from the theoretical computer science literature, we improve the existing minimax lower bound for low-dimensional mean estimation and establish new lower bounds for high-dimensional mean estimation and linear regression problems. We also design differentially private algorithms that attain the minimax lower bounds up to logarithmic factors. In particular, for high-dimensional linear regression, a novel private iterative hard thresholding algorithm is proposed. The numerical performance of differentially private algorithms is demonstrated by simulation studies and applications to real data sets.

Funding Statement

The research of T. Cai was supported in part by NSF Grants DMS-1712735 and DMS-2015259 and NIH Grants R01-GM129781 and R01-GM123056. The research of L. Zhang was supported in part by NSF Grant DMS-2015378.

Acknowledgments

We would like to thank the Associate Editor and referees for their helpful suggestions and comments that led to a great improvement of the paper. We also thank Chi-Yun Wu for advice on Section 6.

Citation

T. Tony Cai, Yichen Wang, Linjun Zhang. "The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy." Ann. Statist. 49 (5) 2825 - 2850, October 2021. https://doi.org/10.1214/21-AOS2058

Information

Received: 1 February 2020; Revised: 1 October 2020; Published: October 2021
First available in Project Euclid: 12 November 2021

Digital Object Identifier: 10.1214/21-AOS2058

Subjects:
Primary: 62F30
Secondary: 62F12, 62J05

Keywords: Differential privacy, High-dimensional data, Linear regression, Mean estimation, Minimax optimality

Rights: Copyright © 2021 Institute of Mathematical Statistics

JOURNAL ARTICLE
26 PAGES

Vol.49 • No. 5 • October 2021