Efficient Deep Learning for Image Applications

The breakthrough of deep learning (DL) has greatly promoted the development of machine learning in numerous academic disciplines and industries in recent years.

A subsequent concern, frequently raised by multidisciplinary researchers, software developers, and machine learning end users, is the inefficiency of DL methods: intolerable training and inference times, exhausted computing resources, and unsustainable power consumption.

To tackle these inefficiency issues, numerous DL efficiency methods have been proposed to improve efficiency without sacrificing the prediction accuracy of a specified application, such as image classification or visual object detection.

However, we argue that traditional DL efficiency methods are not sufficiently flexible or adaptive to meet the requirements of practical usage scenarios, based on two observations.

First, most traditional methods adopt the objective of "no accuracy loss for a specified application," yet this objective fails to cover many practical scenarios.

For example, to meet diverse user needs, a public cloud platform should provide an efficient and multipurpose DL method rather than one tailored to a single application.

Second, most traditional methods adopt model compression and quantization as efficiency enhancement strategies, yet these two strategies degrade severely in a number of scenarios.

For example, for embedded deep neural networks (DNNs), significant architectural changes and quantization may severely undermine customized hardware accelerators designed for predefined DNN operators and numerical precision.
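To make the quantization strategy discussed above concrete, here is a minimal sketch of symmetric post-training int8 weight quantization in plain Python. This is a generic textbook illustration, not any specific technique evaluated in this dissertation, and real deployment toolchains are considerably more involved:

```python
# Symmetric int8 quantization: map float weights to integers in [-127, 127]
# using a single scale factor derived from the largest-magnitude weight.
# Illustrative only; production pipelines use per-channel scales, calibration
# data, and hardware-specific rounding modes.

def quantize_int8(weights):
    """Return (quantized weights, scale) for a flat list of floats."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values."""
    return [q * scale for q in quantized]

w = [0.5, -1.27, 0.03, 1.0]
q, s = quantize_int8(w)       # q holds small integers, 4x smaller than float32
w_hat = dequantize(q, s)      # close to w, up to rounding error
```

The appeal for embedded hardware is that integer arithmetic is cheap; the drawback, as noted above, is that accelerators built around a fixed precision cannot absorb further changes to the numeric format.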

In this dissertation, we investigate three popular usage scenarios and propose corresponding DL efficiency methods: versatile model efficiency, robust model efficiency, and processing-step efficiency.

The first scenario requires a DL method to achieve both model efficiency and versatility.

Model efficiency here means designing a compact deep neural network, while versatility means achieving satisfactory prediction accuracy across multiple applications.

We propose a compact DNN that integrates shape information into a newly designed module, Conv-M, to address the issue that previous compact DNNs cannot achieve a matched level of accuracy on both image classification and unsupervised domain adaptation.

Our method can benefit software developers, who can directly replace an original single-purpose DNN with our versatile one in their programs.

The second scenario requires a DL method to achieve both model efficiency and robustness.

Robustness here means achieving satisfactory prediction accuracy on certain categories of samples that are critical but often mispredicted by previous methods.

We propose a fast training method based on simultaneous adaptive filter reuse (dynamic compression) and neuron-level robustness enhancement, to improve accuracy on self-driving motion prediction, especially on night-driving samples.

Our method can benefit algorithm researchers who are proficient in mathematically exploring loss functions but less skilled in empirically constructing efficient DNN sub-modules, since our dynamic compression requires no such expertise.
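To illustrate the general idea of filter reuse as a compression strategy (a toy sketch only; the adaptive, learned reuse scheme proposed in this dissertation is more sophisticated), consider two convolutional "layers" that share one filter bank, halving the filter parameters:

```python
# Toy 1-D convolution where two layers reuse the same filter.
# Sharing filters across layers reduces parameter count without
# changing the operator set the hardware must support.

def conv1d(signal, kernel):
    """Valid-mode 1-D convolution of a list by a kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

shared_filter = [1.0, -1.0]            # a single stored filter...
x = [3.0, 1.0, 4.0, 1.0, 5.0]
h = conv1d(x, shared_filter)           # ...applied in layer 1
y = conv1d(h, shared_filter)           # ...and reused in layer 2
```

Because reuse leaves the convolution operator and precision untouched, it sidesteps the accelerator-compatibility problem that aggressive architecture changes and quantization can cause.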

The third scenario requires the inference of a DL method to be fast without significantly changing the DNN architecture or adopting quantization.

We propose a fast photorealistic style transfer method that removes the time-consuming smoothing step during inference and introduces a spatially coherent content-style preserving loss during training.

For computer vision engineers who struggle to combine DL efficiency approaches, our method offers a candidate efficiency strategy distinct from the popular approaches of architecture tailoring and quantization.





Wu, Chunpeng (2020). Efficient Deep Learning for Image Applications. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/21458.


Duke's student scholarship is made available to the public using a Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND) license.