Bayesian and Frequentist Intervals under Differential Privacy for Binomial Proportions
| dc.contributor.advisor | Reiter, Jerome P. | |
| dc.contributor.author | Kao, Hsuan-Chen | |
| dc.date.accessioned | 2025-07-02T19:07:54Z | |
| dc.date.available | 2025-07-02T19:07:54Z | |
| dc.date.issued | 2025 | |
| dc.department | Statistical Science | |
| dc.description.abstract | This thesis proposes and compares interval inference methods for a binomial proportion $p$ under differential privacy (DP), using the Laplace mechanism ($\varepsilon$-DP) and the discrete Gaussian mechanism (Rényi DP). We first assess frequentist approaches, including adjusted plug-in Wald and Wilson intervals. Notably, the Wilson interval, traditionally a more robust alternative to the Wald interval with respect to the out-of-bound problem, is no longer a viable substitute in this setting: once Laplace noise (or discrete Gaussian or other noise) is added, the out-of-bound problem persists. We then propose three alternatives. The first is an $\varepsilon$-DP Bayesian credible interval with a uniform prior or a Jeffreys prior, derived from the posterior distribution of $p$ given the noisy observation, $f(p \mid \hat{p}^*)$. The second is an $\varepsilon$-DP sampling-based interval, a practical alternative to the Bayesian method that does not require MCMC; it is less complex and achieves high coverage, though its intervals can be slightly longer and somewhat conservative. The third is an $\varepsilon$-DP exact interval, motivated mainly by the Clopper-Pearson method, which is straightforward and easy to interpret. For the Rényi DP mechanism, we demonstrate only the Bayesian method in this thesis because, in our evaluation of $\varepsilon$-DP with the Laplace mechanism, it provided a better balance between achieving the nominal coverage rate and avoiding overly conservative interval lengths. To make the evaluation informative, we discuss how the privacy parameter $\varepsilon$ controls the Laplace and discrete Gaussian noise, and we examine how specific pairings of the noise level $\varepsilon$ and the binomial proportion $p$ affect the accuracy and coverage of these intervals. We aim to emphasize the trade-offs between privacy and statistical inference precision in differentially private data dissemination. | |
| dc.identifier.uri | | |
| dc.rights.uri | | |
| dc.subject | Statistics | |
| dc.subject | Bayesian Interval | |
| dc.subject | Binomial Proportions | |
| dc.subject | Differential Privacy | |
| dc.title | Bayesian and Frequentist Intervals under Differential Privacy for Binomial Proportions | |
| dc.type | Master's thesis | |
| duke.embargo.months | 0.01 | |
| duke.embargo.release | 2025-07-08 |
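
The abstract does not give the thesis's formulas, so the following is only a minimal illustrative sketch of the general setup it describes: a binomial count released through the Laplace mechanism, followed by a plug-in Wald-style interval whose variance adds the Laplace noise variance to the binomial variance. The function names, the sensitivity-1 assumption for the count, and the specific variance adjustment are assumptions made for illustration and may differ from the adjusted intervals studied in the thesis.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_proportion(successes, n, epsilon, rng):
    # Assumption: changing one record changes the count by at most 1,
    # so Laplace noise with scale 1/epsilon on the count gives epsilon-DP.
    return (successes + laplace_noise(1.0 / epsilon, rng)) / n


def adjusted_wald_interval(p_star, n, epsilon, z=1.959963984540054):
    # Hypothetical plug-in adjustment: binomial variance plus the variance
    # of the scaled Laplace noise, 2 / (n * epsilon)^2.
    p_clip = min(max(p_star, 0.0), 1.0)  # keep the variance estimate valid
    var = p_clip * (1.0 - p_clip) / n + 2.0 / (n * epsilon) ** 2
    half_width = z * math.sqrt(var)
    # Endpoints are deliberately not truncated to [0, 1].
    return p_star - half_width, p_star + half_width


if __name__ == "__main__":
    rng = random.Random(2025)
    p_star = noisy_proportion(successes=3, n=100, epsilon=0.5, rng=rng)
    print(p_star, adjusted_wald_interval(p_star, n=100, epsilon=0.5))
```

Because the endpoints are left untruncated, a noisy proportion near 0 or 1 can yield an interval that extends outside $[0, 1]$, which is the out-of-bound problem the abstract refers to.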