A Privacy Preserving Algorithm to Release Sparse High-dimensional Histograms

dc.contributor.advisor

Steorts, Rebecca C

dc.contributor.author

Li, Bai

dc.date.accessioned

2017-08-16T18:26:07Z

dc.date.available

2019-05-23T08:17:11Z

dc.date.issued

2017

dc.department

Statistical Science

dc.description.abstract

Differential privacy (DP) aims to design methods and algorithms that satisfy rigorous notions of privacy while simultaneously providing utility with valid statistical inference. More recently, an emphasis has been placed on combining notions of statistical utility with algorithmic approaches to address privacy risk in the presence of big data, with differential privacy emerging as a rigorous notion of risk. While DP provides strong guarantees for privacy, there are often tradeoffs regarding data utility and computational scalability. In this paper, we introduce a categorical data synthesizer that releases high-dimensional sparse histograms, illustrating its ability to overcome limitations of data synthesizers in the current literature. Specifically, we combine a differentially private algorithm, the stability-based algorithm, with feature hashing, which allows for dimension reduction of the histograms, and with Gibbs sampling. As a result, our proposed algorithm is differentially private, offers similar or better statistical utility, and is scalable to large databases. In addition, we give an analytical result for the error caused by the stability-based algorithm, which allows us to control the loss of utility. Finally, we study the behavior of our algorithm on both simulated and real data.
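
The abstract names two building blocks, feature hashing and the stability-based algorithm, without spelling them out. The following is a minimal sketch of one common textbook formulation of a stability-based histogram (Laplace noise on non-empty cells only, followed by thresholding), not the thesis's actual implementation. The function names `hash_cell`, `laplace`, and `stability_histogram`, the use of SHA-256 for hashing, and the noise scale and threshold constants are all assumptions made for illustration; the thesis may use different constants and a different hashing scheme.

```python
import hashlib
import math
import random
from collections import Counter


def hash_cell(record, num_buckets):
    """Map a categorical record (a tuple of strings) to one of num_buckets cells.

    Hypothetical stand-in for feature hashing: it reduces the number of
    histogram cells at the cost of possible collisions.
    """
    key = "\x1f".join(record).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def stability_histogram(cells, epsilon, delta, rng=None):
    """Release noisy counts for non-empty cells only.

    Assumed formulation: add Laplace(2/epsilon) noise to each non-empty
    count and keep the cell only if the noisy count exceeds
    1 + 2*ln(2/delta)/epsilon. Empty cells are never touched, which is
    what keeps the cost proportional to the data size rather than to
    the size of the full (sparse) domain.
    """
    rng = rng or random.Random()
    counts = Counter(cells)
    threshold = 1.0 + 2.0 * math.log(2.0 / delta) / epsilon
    released = {}
    for cell, count in counts.items():
        noisy = count + laplace(2.0 / epsilon, rng)
        if noisy > threshold:
            released[cell] = noisy
    return released


if __name__ == "__main__":
    records = [("red", "small")] * 1000 + [("blue", "large")]
    cells = [hash_cell(r, num_buckets=1024) for r in records]
    print(stability_histogram(cells, epsilon=1.0, delta=1e-6, rng=random.Random(0)))
```

In this sketch, cells with small counts are very likely to be suppressed by the threshold, which is the source of the utility loss that the abstract says is characterized analytically.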

dc.identifier.uri

https://hdl.handle.net/10161/15252

dc.subject

Statistics

dc.subject

Computer science

dc.subject

Differential privacy

dc.subject

Methodology

dc.title

A Privacy Preserving Algorithm to Release Sparse High-dimensional Histograms

dc.type

Master's thesis

duke.embargo.months

21

Files

Original bundle

Name:
Li_duke_0066N_13993.pdf
Size:
400.45 KB
Format:
Adobe Portable Document Format
