Answering and Explaining SQL Queries Privately
Date
2022
Abstract
Data privacy has received increasing attention in recent years. Although personal information is collected at large scale for scientific research and commercial products, a privacy breach is not an acceptable trade-off. Over the last decade, differential privacy has become a gold standard for protecting data privacy and has been adopted by many organizations. Past work has focused on building differentially private SQL query answering systems as a building block for wider applications. However, answering counting queries with joins under differential privacy remains a challenge: the join operator allows a single user to have an unbounded impact on the query result, which makes it hard to hide the existence of that user under differential privacy. In addition, introducing differential privacy into query answering makes it harder for users to interpret query results correctly, since a user must distinguish the effect of differential privacy from the contribution of the data.
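To make the join problem concrete, the following is a minimal illustrative sketch and is not taken from the thesis. It assumes a hypothetical two-table schema (customers and orders joined on a customer key) and shows that removing a single customer can change the join count by as much as that customer's number of matching orders, which has no fixed upper bound.

```python
from collections import Counter

def join_count(customers, orders):
    """Count rows of customers JOIN orders on the customer key.

    Both arguments are lists of key values (a hypothetical, simplified schema).
    """
    order_degree = Counter(orders)
    return sum(order_degree[key] for key in customers)

def removal_impact(orders, key):
    """Change in the join count if the customer with this key is removed."""
    return Counter(orders)[key]

# Hypothetical data: one customer has 1000 orders, another has 2.
customers = ["alice", "bob"]
orders = ["alice"] * 1000 + ["bob"] * 2

total = join_count(customers, orders)             # counts every matching pair
impact_alice = removal_impact(orders, "alice")    # grows with alice's orders
impact_bob = removal_impact(orders, "bob")
```

Because `impact_alice` grows with the number of orders a customer has, noise calibrated to a fixed constant cannot hide that customer's presence without some additional bound on per-user contribution.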
In this thesis, we study two problems in answering and explaining SQL queries privately. First, we present efficient algorithms to compute the local sensitivities of counting queries with joins, which is an important prerequisite for answering these queries under differential privacy. We track the sensitivity contributed by each tuple and, based on it, propose a truncation mechanism that answers counting queries with joins privately and with high utility. Second, we propose DPXPlain, a three-phase framework that allows users to get explanations for group-by COUNT/SUM/AVG query results while preserving differential privacy. We use confidence intervals to help users understand the uncertainty that differential privacy introduces into the query results, and we further provide top-k explanations under differential privacy to explain the contribution of the data to the results.
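As a rough illustration of the ideas of truncation and confidence intervals, the sketch below uses textbook building blocks and is not the thesis's mechanism or the DPXPlain framework. It assumes the same hypothetical customers and orders schema, a fixed public threshold `tau`, and a privacy unit of one customer together with all of that customer's orders. Customers with more than `tau` orders are dropped, so adding or removing one customer changes the truncated count by at most `tau`, and Laplace noise with scale `tau / epsilon` is added. The interval covers the truncated count, not the true count, since truncation introduces bias.

```python
import math
import random
from collections import Counter

def truncated_count(customers, orders, tau):
    """Join count that ignores customers with more than tau matching orders."""
    degree = Counter(orders)
    return sum(degree[c] for c in customers if degree[c] <= tau)

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def private_count_with_interval(customers, orders, tau, epsilon,
                                level=0.95, rng=None):
    """Return a noisy truncated count and an interval for the truncated count.

    tau, epsilon, and level are illustrative parameters (assumptions).
    """
    rng = rng or random.Random()
    scale = tau / epsilon
    noisy = truncated_count(customers, orders, tau) + laplace_noise(scale, rng)
    # For Laplace noise, P(|noise| <= t) = 1 - exp(-t / scale).
    half_width = scale * math.log(1.0 / (1.0 - level))
    return noisy, (noisy - half_width, noisy + half_width)

customers = ["alice", "bob"]
orders = ["alice"] * 1000 + ["bob"] * 2
noisy, (low, high) = private_count_with_interval(
    customers, orders, tau=10, epsilon=1.0, rng=random.Random(0))
```

A smaller `tau` reduces the noise but discards more data, while a larger `tau` does the reverse. Choosing that threshold well, and doing so privately, is where a per-tuple sensitivity analysis of the kind described in the abstract would matter.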
Citation
Tao, Yuchao (2022). Answering and Explaining SQL Queries Privately. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/25782.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.