Differentially Private Verification of Predictions from Synthetic Data

Abstract

When data are confidential, one approach for releasing publicly available files is to make synthetic data, i.e., data simulated from statistical models estimated on the confidential data. Given access only to synthetic data, users have no way to tell whether the synthetic data preserve the adequacy of their analyses. Thus, I present methods that help users make such assessments automatically while controlling the risk of disclosing information in the confidential data. Three verification methods are presented in this thesis: differentially private prediction tolerance intervals, a differentially private prediction histogram, and a differentially private Kolmogorov-Smirnov test. I use simulation to illustrate these prediction verification methods.
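
As a rough illustration of the general idea behind a differentially private histogram, the sketch below adds Laplace noise to bin counts. It is not the thesis's algorithm: the function name `dp_histogram`, the fixed bin edges, the clamping of out-of-range values, and the assumption that neighboring datasets differ by adding or removing one record (so each count has sensitivity 1) are all illustrative choices made here.

```python
import random
from bisect import bisect_right


def dp_histogram(values, edges, epsilon, rng=None):
    """Return noisy bin counts for `values` over fixed `edges`.

    Hypothetical helper: `edges` must be chosen without looking at the
    confidential data, and neighboring datasets are assumed to differ by
    one added or removed record, giving each count sensitivity 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()

    counts = [0] * (len(edges) - 1)
    for v in values:
        # Locate the bin; clamp out-of-range values into the end bins.
        i = min(max(bisect_right(edges, v) - 1, 0), len(counts) - 1)
        counts[i] += 1

    # Laplace(0, 1/epsilon) noise, drawn as the difference of two
    # independent exponentials with rate epsilon.
    return [
        c + rng.expovariate(epsilon) - rng.expovariate(epsilon)
        for c in counts
    ]


# Example: prediction errors (made-up numbers) binned on [-2, 2].
errors = [-1.5, -0.5, -0.2, 0.3, 1.7, 5.0]
noisy = dp_histogram(errors, [-2, -1, 0, 1, 2], epsilon=1.0)
```

Smaller values of `epsilon` give stronger privacy but noisier counts; noisy counts may be negative or non-integer, and any rounding or truncation applied afterward is post-processing that does not weaken the privacy guarantee.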

Citation

Yu, Haoyang (2017). Differentially Private Verification of Predictions from Synthetic Data. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/16424.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.