Accelerator Architectures for Deep Learning and Graph Processing

dc.contributor.advisor

Chen, Yiran

dc.contributor.advisor

Li, Hai

dc.contributor.author

Song, Linghao

dc.date.accessioned

2020-09-18T16:00:41Z

dc.date.available

2021-03-02T09:17:15Z

dc.date.issued

2020

dc.department

Electrical and Computer Engineering

dc.description.abstract

Deep learning and graph processing are two big-data applications that are widely used in many domains. The training of deep learning models is essential for inference, yet it has not been fully studied. With data forward, error backward, and gradient calculation, deep learning training is a complicated process with high computation and communication intensity. Distributing computations across multiple heterogeneous accelerators to achieve high throughput and balanced execution, however, remains challenging. In this dissertation, I present AccPar, a principled and systematic method for determining the tensor partitioning across multiple heterogeneous accelerators for efficient training acceleration. Emerging resistive random access memory (ReRAM) is promising for processing in memory (PIM). For high-throughput training acceleration in ReRAM-based PIM accelerators, I present PipeLayer, an architecture for layer-wise pipelined parallelism. Graph processing is well known for poor locality and high memory bandwidth demand. On conventional architectures, graph processing incurs a significant amount of data movement and energy consumption. I present GraphR, the first ReRAM-based graph processing accelerator, which follows the principle of near-data processing and explores the opportunity of performing massively parallel analog operations at low hardware and energy cost. Sparse matrix-vector multiplication (SpMV), a subset of graph processing, is the key computation in iterative solvers for scientific computing. Efficiently accelerating floating-point processing in ReRAM, however, remains a challenge. In this dissertation, I present ReFloat, a data format and a supporting accelerator architecture for low-cost floating-point processing in ReRAM for scientific computing.

dc.identifier.uri

https://hdl.handle.net/10161/21507

dc.subject

Computer engineering

dc.subject

Computer science

dc.subject

Accelerators

dc.subject

Computer architecture

dc.subject

Deep learning

dc.subject

Graph Processing

dc.title

Accelerator Architectures for Deep Learning and Graph Processing

dc.type

Dissertation

duke.embargo.months

5.391780821917808

Files

Original bundle

Name:
Song_duke_0066D_15866.pdf
Size:
2.97 MB
Format:
Adobe Portable Document Format
