Differentially Private Counts with Additive Constraints
Differential privacy is a rigorous mathematical definition of privacy: it guarantees that the output of a data analysis reveals little about whether any individual's data are included in the dataset. To reduce disclosure risks, statistical agencies and other organizations can release noisy counts that satisfy differential privacy. In some contexts, the released counts need to satisfy additive constraints; for example, the released value of a total should equal the sum of the released values of its components. In this thesis, I present a simple post-processing procedure for satisfying such additive constraints. The basic idea is to (i) compute approximate posterior modes or draw posterior samples of the true counts given the noisy counts, (ii) construct a multinomial distribution with trial size equal to the posterior mode or posterior draw of the total and probability vector equal to fractions derived from the posterior modes or posterior draws of the components, and (iii) find and release a mode or samples of this multinomial distribution. I also present an approach for making Bayesian inferences about the true counts given these post-processed, differentially private counts. I illustrate these methods using simulations.
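The multinomial post-processing in steps (ii) and (iii) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the posterior draws of the total and components are made-up placeholder values (in practice they would come from step (i), a posterior computation given the noisy counts), and NumPy's multinomial sampler stands in for the release step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws of the true counts, assumed to have been
# obtained in step (i) from the noisy, differentially private counts.
posterior_total = 1000                                   # draw of the true total
posterior_components = np.array([512.0, 308.0, 190.0])   # draws of the components

# (ii) Construct a multinomial: trial size equals the posterior draw of the
# total; the probability vector is the components' fractions of their sum.
probs = posterior_components / posterior_components.sum()

# (iii) Release a draw from this multinomial. By construction, the released
# component counts sum exactly to the released total, so the additive
# constraint holds.
released = rng.multinomial(posterior_total, probs)

print(released, released.sum())
```

The key design point is that a multinomial draw is guaranteed to sum to its trial size, so the additive constraint is satisfied by construction rather than enforced after the fact.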
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.