Privacy Protection via Adversarial Examples

Machine learning is increasingly exploited by attackers to perform automated, large-scale inference attacks. For instance, in attribute inference attacks, an attacker can use a machine learning classifier to predict a target user's private, sensitive attributes (e.g., gender, political view) from the public data the user shares (e.g., friendships, page likes). In membership inference attacks, given the confidence score vector produced by a target classifier for a data sample, an attacker can leverage a machine learning classifier to infer whether the data sample belongs to the training dataset of the target classifier. These inference attacks pose severe security and privacy risks to users in various web and mobile applications as well as real-world systems. Existing defenses are either computationally intractable or achieve sub-optimal privacy-utility tradeoffs.
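To make the membership inference setting above concrete, the following is a minimal toy sketch (not the attack studied in the dissertation, which trains a classifier on confidence vectors): an attacker exploits the observation that classifiers tend to be more confident on their own training samples, and simply thresholds the maximum confidence score. The function name and threshold value are illustrative assumptions.

```python
import numpy as np

def membership_infer(confidence_vector, threshold=0.9):
    """Toy membership inference heuristic: a sample classified with
    very high confidence is guessed to be a training ("member")
    sample. Real attacks instead train a binary classifier on the
    full confidence score vector."""
    return float(np.max(confidence_vector)) >= threshold

# A confidently classified sample is guessed to be a member,
# a less confident one a non-member.
member_guess = membership_infer(np.array([0.97, 0.02, 0.01]))     # True
nonmember_guess = membership_infer(np.array([0.55, 0.30, 0.15]))  # False
```

This toy version illustrates why perturbing the confidence score vector (as the defenses below do) can mislead the attacker's inference.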

In the first part of this dissertation, we develop a new attribute inference attack to infer users' private attributes in online social networks. We then discuss how to leverage adversarial examples to defend against inference attacks. Our key observation is that machine learning classifiers are vulnerable to adversarial examples: given an input, we can add carefully crafted noise to it such that a machine learning classifier produces a desired output. In the second part of this dissertation, we present AttriGuard, a practical defense against attribute inference attacks. AttriGuard turns users' public data into adversarial examples so that an attacker cannot correctly infer users' private attributes. Our experimental results on a real-world dataset show that AttriGuard achieves a better privacy-utility tradeoff than existing defenses, including game-theoretic defenses and (local) differential privacy. In the third part of this dissertation, we propose MemGuard, the first defense against membership inference attacks with formal utility guarantees. MemGuard turns confidence score vectors into adversarial examples to mislead an attacker's machine learning classifier. Our results show that MemGuard can (provably) preserve the classification accuracy of a machine learning classifier while protecting the privacy of its training data. Adversarial examples are often viewed as offensive techniques that compromise the security of machine learning; we show that they can also be used for privacy protection. Finally, we briefly discuss our work on defending against adversarial examples with provable robustness guarantees.
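The MemGuard idea of perturbing confidence score vectors while preserving utility can be sketched as follows. This is a hedged simplification, not the actual MemGuard algorithm (which solves a constrained optimization problem to fool the attacker's classifier): here we merely cap the top confidence score below an attacker's high-confidence threshold while keeping the predicted label (the argmax) unchanged, so classification accuracy is preserved. The function name and `target_max` parameter are illustrative assumptions.

```python
import numpy as np

def add_defensive_noise(scores, target_max=0.8):
    """Sketch of a MemGuard-style perturbation: lower the top
    confidence score to `target_max` and redistribute the removed
    probability mass over the other classes, so the vector remains
    a valid distribution and the argmax (predicted label) is kept."""
    scores = np.asarray(scores, dtype=float)
    top = int(np.argmax(scores))
    if scores[top] <= target_max:
        return scores  # already below the attacker's threshold
    noisy = scores.copy()
    excess = scores[top] - target_max
    noisy[top] = target_max
    # Spread the removed mass uniformly over the remaining classes.
    others = [i for i in range(len(scores)) if i != top]
    noisy[others] += excess / len(others)
    return noisy

orig = np.array([0.97, 0.02, 0.01])
noisy = add_defensive_noise(orig)
# The predicted label is preserved and the vector still sums to 1,
# but a high-confidence membership heuristic no longer fires.
```

In this sketch the utility guarantee is immediate: the argmax never changes, so the classifier's predictions (and hence its accuracy) are untouched, while the attacker sees a less informative confidence vector.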





Jia, Jinyuan (2022). Privacy Protection via Adversarial Examples. Dissertation, Duke University. Retrieved from


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.