Efficient Neural Network Based Systems on Mobile and Cloud Platforms

dc.contributor.advisor

Chen, Yiran

dc.contributor.advisor

Li, Hai

dc.contributor.author

Mao, Jiachen

dc.date.accessioned

2020-09-18T16:00:31Z

dc.date.available

2020-09-18T16:00:31Z

dc.date.issued

2020

dc.department

Electrical and Computer Engineering

dc.description.abstract

In recent years, machine learning, and neural networks in particular, has had unprecedented influence in both academia and industry.

The reason lies in the state-of-the-art performance of neural networks on many critical applications such as object detection, translation, and games. However, deploying neural network models on resource-constrained devices (e.g., edge devices) is challenged by their heavy memory and computing costs during execution. Previous work has pursued efficient execution of neural networks from the perspectives of hardware, software, and algorithms.

My research during my Ph.D. study focuses mainly on software and algorithms targeting mobile platforms. More specifically, we emphasize the system design, system optimization, and model compression of neural networks for a better mobile user experience. From the system design perspective, we first propose MoDNN, a local distributed mobile computing system for DNN testing. MoDNN partitions pre-trained DNN models across several mobile devices to accelerate DNN computation by reducing the computing cost and memory usage of each device. Two model partition schemes are also designed to minimize non-parallel data delivery time, including both wakeup time and transmission time. Then, we propose AdaLearner, an adaptive local distributed mobile computing system for DNN training. To exploit the potential of our system, we adapt the neural network training phase to the resources of each mobile device and aggressively reduce the transmission overhead for better system scalability. From the system optimization perspective, we propose MobiEye, a cloud-based video detection system optimized for deployment in real-time mobile applications. MobiEye is based on a state-of-the-art video detection framework called Deep Feature Flow (DFF) and improves on it with three system-level optimization methods. From the model compression perspective, we propose TPrune, a model analysis and pruning framework for Transformer. In TPrune, we first propose Block-wise Structured Sparsity Learning (BSSL) to analyze the properties of Transformer models. Then, based on the characteristics derived from BSSL, we apply Structured Hoyer Square (SHS) to derive the final compressed models. Together, the projects completed during my Ph.D. study could contribute to current research on efficient neural network execution and thus lead to more user-friendly and intelligent applications on edge devices for more users.

dc.identifier.uri

https://hdl.handle.net/10161/21492

dc.subject

Computer engineering

dc.title

Efficient Neural Network Based Systems on Mobile and Cloud Platforms

dc.type

Dissertation

Files

Original bundle

Name:
Mao_duke_0066D_15849.pdf
Size:
12.96 MB
Format:
Adobe Portable Document Format
