Enable Intelligence on Billion Devices with Deep Learning
With the proliferation of edge computing and the Internet of Things (IoT), billions of edge devices (e.g., smartphones, AR/VR headsets, and autonomous cars) are deployed in our daily lives and constantly generate gigantic amounts of data at the network edge. Bringing deep learning to such huge volumes of data will enable many novel applications and services in the edge ecosystem and fuel the continued growth of artificial intelligence (AI). This motivates an urgent need to push the AI frontier to the network edge in order to fully exploit the big data residing on edge devices.
However, empowering edge intelligence with AI, especially deep learning, is technically challenging because of several critical issues, including privacy, efficiency, and performance. Conventional wisdom requires edge devices to transmit their data to cloud datacenters for training and inference, but moving such a huge amount of data is prohibitive because of its cost, high transmission delay, and risk of privacy leakage. The emerging federated learning (FL) is a promising distributed learning paradigm that enables a massive number of devices to collaboratively learn a machine learning model (e.g., a deep neural network) without explicitly sharing data, which mitigates the privacy concerns caused by data sharing in centralized learning. FL, however, faces critical challenges of its own that hinder its deployment on edge devices, such as communication cost and data heterogeneity.
Once a machine learning model has been learned, the next step is to deploy it to serve applications and services. One straightforward approach is to deploy the model on the device and perform inference locally. Unfortunately, on-device AI often suffers from poor performance because most AI applications require high computational power, which resource-constrained edge devices cannot provide. Edge computing pushes cloud services from the network core to the network edge, so bridging devices with edge servers can alleviate the computational cost of running AI models on the device alone. However, such a collaborative deployment scheme inevitably incurs transmission delay and raises privacy concerns because data moves between devices and edge servers. For example, the device can send features extracted from raw data (e.g., images) to the cloud, where a pre-trained machine learning model is deployed, but attackers can still exploit these extracted features to recover the raw data and to infer embedded private attributes (e.g., age and gender).
In this dissertation, I start by presenting a privacy-respecting data crowdsourcing framework for deep learning that addresses the privacy issue in centralized training. I then shift from the centralized setting to the decentralized one and propose three novel FL frameworks that jointly improve communication and computation efficiency while handling heterogeneous data across devices. In addition to improving learning on large-scale edge devices, I design an efficient edge-assisted photorealistic video style transfer system for mobile phones that leverages collaboration between smartphones and the edge server. Finally, to mitigate the privacy concerns caused by data movement in the collaborative system, I propose an adversarial training framework that prevents an adversary from reconstructing the raw data and inferring private attributes.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations