Dynamic Deep Learning Acceleration with Co-Designed Hardware Architecture

Hanson, Edward Thor

Date

2023

Authors

Hanson, Edward Thor

Advisors

Chen, Yiran

Abstract

Recent advancements in Deep Learning (DL) hardware target the training and inference of static DL models, thus simultaneously achieving high runtime performance and efficiency. However, dynamic DL models are seen as the next step in pushing the accuracy-performance tradeoff of DL inference and training in our favor; by reshaping the model's parameters or structure based on the input, dynamic DL models have the potential to boost accuracy while introducing only marginal computation cost. As the field of DL progresses towards dynamic models, many of the advancements in DL accelerator design are eclipsed by data movement-related bottlenecks introduced by unpredictable memory access patterns and computation flow. Additionally, designing hardware for every niche task is inefficient due to the high cost of developing new hardware. Therefore, we must carefully design the DL hardware and software stack to support future dynamic DL models by emphasizing flexibility and generality without sacrificing end-to-end performance and efficiency.

This dissertation targets algorithm-, hardware-, and software-level optimizations for DL systems. Starting from the algorithm level, the robust nature of DNNs is exploited to reduce computational and data movement demand. At the hardware level, dynamic hardware mechanisms are investigated to better serve a broad range of impactful future DL workloads. At the software level, statistical patterns of dynamic models are leveraged to enhance the performance of offline and online scheduling strategies. The success of this research is measured by considering all key metrics associated with DL and DL acceleration: inference latency and accuracy, training throughput, peak memory occupancy, area efficiency, and energy efficiency.

Type

Dissertation

Department

Electrical and Computer Engineering

Subjects

Computer engineering, Accelerators, Deep learning, Dynamic models, Workload Scheduling

Permalink

https://hdl.handle.net/10161/30281

Rights

https://creativecommons.org/licenses/by-nc-nd/4.0/

Citation

Hanson, Edward Thor (2023). Dynamic Deep Learning Acceleration with Co-Designed Hardware Architecture. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/30281.

Collections

Dissertations

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.
