Processing-in-Memory Accelerators Toward Energy-Efficient Real-World Machine Learning
Date
2024
Authors
Kim, Bokyung
Abstract
Artificial intelligence (AI) has permeated the real world with unprecedented success. Countless applications exploit machine learning (ML) technologies built on big data and compute-intensive algorithms. Moreover, the pursuit of genuine machine intelligence is moving computation toward the edge to handle complex tasks conventionally reserved for human beings. Alongside this rapid development, the gap between the growing resource requirements of ML and the constrained environments at the edge demands urgent attention to efficiency. Closing this gap requires solutions across hardware disciplines, beyond algorithm development alone.
Unfortunately, hardware development lags far behind because of heterogeneity. While the dramatic advance of ML algorithms is a game changer for computing paradigms, conventional hardware is ill-suited to these new paradigms due to fundamental limitations in its architecture and technology. The traditional architecture, which separates storage from computation, is highly inefficient for the massive data movement and computation that ML algorithms require, exhibiting high power consumption and low performance. Recognition of these fundamental limitations motivates efficient, non-conventional hardware accelerators.
As a new hardware paradigm, processing-in-memory (PIM) accelerators have raised significant expectations because they directly address the limitations of traditional hardware. PIM merges processing and memory units, saving the resources otherwise spent moving data between them; by pursuing non-heterogeneity, it ultimately improves efficiency. Previous PIM accelerators have shown promising results for high-performance computing, particularly thanks to emerging memory devices known as memristors.
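To give a sense of why merging computation into memory is effective, here is an illustrative sketch of the standard memristor-crossbar scheme (a general example, not a formula taken from the dissertation): weights are encoded as crossbar conductances $G_{ij}$ and inputs as row voltages $V_i$, so that each column current

$$ I_j = \sum_i G_{ij} \, V_i $$

realizes a matrix-vector multiplication in place, with Ohm's law performing the multiplications and Kirchhoff's current law performing the accumulation directly inside the memory array.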
Despite this motivation for non-heterogeneity, PIM-based designs have not fully escaped heterogeneity, which causes inefficiency and high cost. While emerging memories enable breakthroughs at the device and circuit levels, PIM at higher levels must contend with the variety of components in a system (horizontal heterogeneity). Furthermore, PIM is designed holistically across hierarchical levels (vertical heterogeneity), which complicates efficient design. Even robustness can be significantly affected by heterogeneity.
Confronting these challenges in heterogeneity, efficiency, and robustness, my research has advanced PIM hardware through cross-layer design for practically efficient ML acceleration. Focusing on architecture- and system-level innovations, I have pioneered novel 3D architectures and systemic paradigms that provide a strong foundation for future computing. For ML acceleration, I have proposed new methodologies to operate 3D architectures efficiently, along with a novel dataflow and a new 3D design that achieve energy efficiency by pursuing non-heterogeneity. These innovations have been examined through rigorous hardware experiments, and their practical efficiency has been demonstrated with a fabricated chip for seizure classification, a real-world application. In response to the needs of future ML, my research is evolving to achieve robustness in hardware ML platforms. In this dissertation, I summarize these research contributions based on my diverse design experience, spanning architecture and system design through chip fabrication.
Type
Dissertation
Permalink
https://hdl.handle.net/10161/31916
Citation
Kim, Bokyung (2024). Processing-in-Memory Accelerators Toward Energy-Efficient Real-World Machine Learning. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/31916.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.