Accelerator Architectures for Deep Learning and Graph Processing

dc.contributor.advisor

Chen, Yiran

dc.contributor.advisor

Li, Hai

dc.contributor.author

Song, Linghao

dc.date.accessioned

2020-09-18T16:00:41Z

dc.date.available

2021-03-02T09:17:15Z

dc.date.issued

2020

dc.department

Electrical and Computer Engineering

dc.description.abstract

Deep learning and graph processing are two big-data applications that are widely used in many domains. The training of deep learning models is essential for inference, yet it has not been fully studied. With data forward, error backward, and gradient calculation, deep learning training is a complicated process with high computation and communication intensity. Distributing computations across multiple heterogeneous accelerators to achieve high throughput and balanced execution, however, remains challenging. In this dissertation, I present AccPar, a principled and systematic method for determining the tensor partitioning across multiple heterogeneous accelerators for efficient training acceleration. Emerging resistive random access memory (ReRAM) is promising for processing in memory (PIM). For high-throughput training acceleration in ReRAM-based PIM accelerators, I present PipeLayer, an architecture for layer-wise pipelined parallelism. Graph processing is well known for poor locality and high memory bandwidth demand. On conventional architectures, graph processing incurs a significant amount of data movement and energy consumption. I present GraphR, the first ReRAM-based graph processing accelerator, which follows the principle of near-data processing and explores the opportunity of performing massively parallel analog operations at low hardware and energy cost. Sparse matrix-vector multiplication (SpMV), a subset of graph processing, is the key computation in iterative solvers for scientific computing. Efficiently accelerating floating-point processing in ReRAM, however, remains a challenge. In this dissertation, I present ReFloat, a data format and a supporting accelerator architecture for low-cost floating-point processing in ReRAM for scientific computing.

dc.identifier.uri

https://hdl.handle.net/10161/21507

dc.subject

Computer engineering

dc.subject

Computer science

dc.subject

Accelerators

dc.subject

Computer architecture

dc.subject

Deep learning

dc.subject

Graph Processing

dc.title

Accelerator Architectures for Deep Learning and Graph Processing

dc.type

Dissertation

duke.embargo.months

5.391780821917808

Files

Original bundle

Name:
Song_duke_0066D_15866.pdf
Size:
2.97 MB
Format:
Adobe Portable Document Format
