Processing-in-Memory Accelerators Toward Energy-Efficient Real-World Machine Learning

Limited Access: this item is unavailable until 2025-03-08.

Date

2024


Abstract

Artificial intelligence (AI) has permeated the real world with unprecedented success. Countless applications exploit machine learning (ML) technologies built on big data and compute-intensive algorithms. Moreover, the aspiration toward authentic machine intelligence is moving computing to the edge to handle complex tasks conventionally reserved for human beings. Alongside this rapid development, the gap between the growing resource requirements of ML and the constrained environments of edge devices demands urgent attention to efficiency. Closing this gap requires solutions across different hardware disciplines, beyond algorithm development alone.

Unfortunately, hardware development falls far behind because of heterogeneity. While the remarkable advance of ML algorithms is a game-changer for computing paradigms, conventional hardware is ill-suited to these new paradigms due to fundamental limitations in its architecture and technology. The traditional architecture, which separates storage and computation, is deeply inefficient for the massive data movement and computation in ML algorithms, exhibiting high power consumption and low performance. Recognition of these fundamental limitations motivates efficient, non-conventional hardware accelerators.

As a new hardware paradigm, processing-in-memory (PIM) accelerators have raised significant expectations because they directly address the limitations of traditional hardware. PIM merges memory and processing units, saving resources for data movement and computation, thereby pursuing non-heterogeneity and ultimately improving efficiency. Previous PIM accelerators have shown promising outcomes in high-performance computing, particularly thanks to emerging memories known as memristors.
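To make the PIM principle concrete: in a memristor crossbar, conductances encode weight values, applied voltages encode inputs, and the currents summed on each column (by Kirchhoff's current law) yield a matrix-vector product in a single analog step. The following minimal Python sketch, not taken from the dissertation and using hypothetical values, emulates that ideal behavior digitally to illustrate the idea:

```python
# Illustrative sketch (not from the dissertation): an ideal memristor
# crossbar computes y = G^T v in one analog step. Conductances G encode
# weights, applied voltages v encode inputs, and each column current is
# the sum of G[i, j] * V[i] over rows (Kirchhoff's current law).
import numpy as np

def crossbar_mvm(conductances, voltages):
    """Ideal crossbar: current at column j = sum_i G[i, j] * V[i]."""
    return voltages @ conductances

# 3 input rows (word lines) x 2 output columns (bit lines);
# values are hypothetical, chosen only for illustration.
G = np.array([[1.0, 0.5],
              [0.2, 0.3],
              [0.4, 0.1]])      # conductances (siemens)
V = np.array([0.1, 0.2, 0.3])  # input voltages (volts)

I = crossbar_mvm(G, V)         # column currents (amperes)
print(I)  # → [0.26 0.14]
```

Real devices deviate from this ideal (wire resistance, device nonlinearity, limited precision), which is part of why the higher-level design challenges discussed next arise.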

Despite its motivation toward non-heterogeneity, PIM-based designs have not fully escaped heterogeneity, which causes inefficiency and high costs. While emerging memories provide revolutions at the device and circuit levels, PIM at higher levels struggles with diverse system components (horizontal heterogeneity). Furthermore, PIM must be holistically designed across hierarchical levels (vertical heterogeneity), which complicates efficient design. Even robustness can be significantly affected by heterogeneity.

Confronting these challenges in heterogeneity, efficiency, and robustness, my research has advanced PIM hardware through cross-layer designs for practically efficient ML acceleration. Focusing on architecture- and system-level innovations, I have pioneered novel 3D architectures and systemic paradigms that provide a strong foundation for future computing. For ML acceleration, I have proposed new methodologies to efficiently operate 3D architectures, along with a novel dataflow and a new 3D design that achieve energy efficiency by pursuing non-heterogeneity. These innovations have been examined through rigorous hardware experiments, and their practical efficiency has been demonstrated with a fabricated chip for seizure classification, a real-world application. In response to the needs of future ML, my research is evolving to achieve robustness in hardware ML platforms. In this dissertation, I summarize the research impacts based on my diverse design experience, spanning from architecture and system design to chip fabrication.


Citation

Kim, Bokyung (2024). Processing-in-Memory Accelerators Toward Energy-Efficient Real-World Machine Learning. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/31916.


Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.