Dynamic Deep Learning Acceleration with Co-Designed Hardware Architecture

Date

2023


Abstract

Recent advancements in Deep Learning (DL) hardware target the training and inference of static DL models, simultaneously achieving high runtime performance and efficiency. However, dynamic DL models are seen as the next step in further pushing the accuracy-performance tradeoff of DL inference and training in our favor; by reshaping the model's parameters or structure based on the input, dynamic DL models have the potential to boost accuracy while introducing only marginal computation cost. As the field of DL progresses towards dynamic models, many of the advancements in DL accelerator design are eclipsed by data movement bottlenecks introduced by unpredictable memory access patterns and computation flow. Additionally, designing hardware for every niche task is inefficient due to the high cost of developing new hardware. Therefore, we must carefully design the DL hardware and software stack to support future, dynamic DL models by emphasizing flexibility and generality without sacrificing end-to-end performance and efficiency.

This dissertation pursues algorithmic-, hardware-, and software-level optimizations for DL systems. Starting from the algorithm level, the robust nature of DNNs is exploited to reduce computational and data movement demand. At the hardware level, dynamic hardware mechanisms are investigated to better serve a broad range of impactful future DL workloads. At the software level, statistical patterns of dynamic models are leveraged to enhance the performance of offline and online scheduling strategies. The success of this research is measured by considering all key metrics associated with DL and DL acceleration: inference latency and accuracy, training throughput, peak memory occupancy, area efficiency, and energy efficiency.


Citation

Hanson, Edward Thor (2023). Dynamic Deep Learning Acceleration with Co-Designed Hardware Architecture. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/30281.

Collections


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.