Browsing by Subject "RRAM"
Item Open Access FAULT MODELING, DESIGN-FOR-TEST, AND FAULT TOLERANCE FOR MACHINE LEARNING HARDWARE (2022) Chaudhuri, Arjun
The ubiquitous application of deep neural networks (DNNs) has led to a rise in demand for custom artificial intelligence (AI) accelerators. Domain-specific AI accelerators for machine-learning inferencing applications are homogeneous designs composed of thousands of identical compute cores, or processing elements (PEs), that interface with the on-chip memory (such as local and global buffers). Accelerators can be classified on the basis of two major use-cases: training and inferencing. Inferencing is carried out by using AI accelerators on edge devices as well as in datacenters. They are being deployed for inferencing in autonomous driving, manufacturing automation, and navigation. Many such use-cases require high reliability. However, DNN inferencing applications are inherently fault-tolerant with respect to structural faults in the hardware; it has been shown that many faults are not functionally critical, i.e., they do not lead to any significant error in inferencing. As a result, testing for all faults in an accelerator chip is overkill. Methods of functional criticality assessment need to be devised for low-cost testing of large AI chips. Moreover, testing homogeneous array-based AI accelerators by running automatic test pattern generation (ATPG) at the array level results in a high CPU time and pattern count. Current test methods do not fully exploit the regular dataflow in the accelerators. Hence, we plan on developing a "constant-testable" solution wherein a small test-pattern set is generated for one PE and reused for testing all other PEs.
Deep neural net (DNN)-driven inferencing applications such as image classification are inherently fault-tolerant with respect to structural faults; it has been shown that many faults are not functionally critical, i.e., they do not lead to any significant error in inferencing. This dissertation proposes low-cost structural and functional test methods for AI accelerators. Incorporation of the knowledge of fault criticality in testing enables the application of dedicated test effort for functionally critical faults. The dissertation utilizes supervised learning-driven DNNs, graph convolutional networks (GCNs), and neural twins of digital logic circuits to evaluate the functional criticality of faults in the gate-level netlist of an inferencing accelerator, thereby bypassing the need for computationally expensive brute-force fault simulations.
The generation of labeled data for supervised learning introduces prohibitive computation costs if the labeling process involves time-consuming simulations. For criticality analysis, a large number of fault simulations are needed to collect sufficient information about critical and benign faults. High runtime requirements for collecting sufficient labeled data become the bottleneck in supervised learning-driven fault-criticality analysis. This dissertation presents methodologies that reduce the amount of labeled and balanced data required for accurate classifier training.
Resistive-oxide random-access memory (RRAM) devices constitute a promising technology for building neuromorphic accelerator hardware due to their processing-in-memory (neuromorphic) abilities. The fundamental matrix-multiply operations in AI accelerators can be executed with reduced latency and power consumption by RRAM cells; however, these cells are known to suffer from high defect rates that contribute to faulty behavior. It is therefore important to analyze RRAM fault models and understand the root causes of defects and variations. In this dissertation, we present a physics-based classification of RRAM fault origins for dense RRAM crossbars; high density is a requirement for the training and inferencing of large neural networks with a high throughput. The resulting insights into the RRAM fault origins provide valuable feedback for the fabrication and design of RRAM-based accelerators. In addition to fault analysis, we need to tolerate faulty RRAM cells in a crossbar to ensure intended system operation, especially when crossbars suffer from low-to-medium defect densities and it is not economically viable to discard the entire crossbar. Although software-based fault-tolerance schemes have been proposed in the literature, more efficient fault tolerance for RRAM crossbars can be achieved through innovations in the hardware design. The dissertation presents the architecture of a novel processing element to tolerate faults in binary RRAM-based crossbars for in-memory computing.
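As a rough illustration of why faulty cells matter for crossbar-based matrix operations, the sketch below models a binary crossbar as a 0/1 weight matrix and injects random stuck-at faults. The stuck-at-0/stuck-at-1 model, the 5% fault rate, the array size, and the function names `crossbar_vmm` and `inject_stuck_at` are illustrative assumptions; they are not the dissertation's physics-based fault classification or its processing-element architecture.

```python
import random


def crossbar_vmm(inputs, weights):
    """Ideal binary crossbar: out[j] = sum over rows i of inputs[i] * weights[i][j]."""
    cols = len(weights[0])
    return [sum(x * row[j] for x, row in zip(inputs, weights)) for j in range(cols)]


def inject_stuck_at(weights, fault_rate, rng):
    """Return a copy where each cell is, with probability fault_rate, stuck at 0 or 1.

    Assumed simplification: a faulty cell ignores its programmed value entirely.
    """
    faulty = [row[:] for row in weights]
    for row in faulty:
        for j in range(len(row)):
            if rng.random() < fault_rate:
                row[j] = rng.choice((0, 1))
    return faulty


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = [[rng.randint(0, 1) for _ in range(8)] for _ in range(16)]
inputs = [rng.randint(0, 1) for _ in range(16)]

ideal = crossbar_vmm(inputs, weights)
faulty = crossbar_vmm(inputs, inject_stuck_at(weights, 0.05, rng))
deviation = [abs(a - b) for a, b in zip(ideal, faulty)]

print("ideal column sums :", ideal)
print("faulty column sums:", faulty)
print("per-column error  :", deviation)
```

Comparing the two outputs column by column shows how many column sums a given defect density disturbs, which is the kind of deviation a fault-tolerance scheme has to absorb.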
Monolithic 3D (M3D) ICs have emerged as suitable platforms for high-density vertical integration of large system-on-chips (SoCs) like domain-specific and neuromorphic inferencing accelerators, with significant improvement in power, performance, and area (PPA) over 2D and conventional 3D-stacked ICs. However, the immature M3D fabrication process is prone to defects, especially in the inter-layer vias (ILVs), and to inter-tier process variations. In this dissertation, we present state-of-the-art low-cost built-in self-test (BIST) solutions for detecting and localizing both hard and resistive (small-delay) defects in ILVs. In addition to testing ILVs in high-density and realistic M3D layouts, tier-level fault localization is needed for yield ramp-up prior to high-volume production of M3D accelerator ICs. Due to overhead concerns, only a limited number of observation points can be inserted on the outgoing ILVs of an M3D tier for fault localization. This dissertation introduces NodeRank, an intelligent graph-theoretic algorithm for inserting observation points on an optimal set of outgoing ILVs in an M3D tier, which increases the diagnosability of detected faults in the M3D design.
In summary, the dissertation addresses important problems related to the functional impact of hardware faults in machine learning applications, low-cost test and diagnosis of accelerator faults, technology bring-up and fault tolerance for RRAM-based neuromorphic engines, and design-for-testability (DfT) for high-density M3D ICs. The insights and findings resulting from this dissertation are anticipated to lead to the fabrication of reliable accelerator ICs supported by low-cost DfT infrastructure.
Item Open Access Fault Tolerance for RRAM-Based Matrix Operations Liu, Mengyun; Xia, Lixue; Wang, Yu; Chakrabarty, K
An RRAM-based computing system (RCS) provides an energy-efficient hardware implementation of vector-matrix multiplication for machine-learning hardware. However, it is vulnerable to faults due to the immature RRAM fabrication process. We propose an efficient fault tolerance method for RCS; the proposed method, referred to as extended-ABFT (X-ABFT), is inspired by algorithm-based fault tolerance (ABFT). We utilize row checksums and test-input vectors to extract signatures for fault detection and error correction. We present a solution to alleviate the overflow problem caused by the limited number of voltage levels for the test-input signals. Simulation results show that for a Hopfield classifier with faults in 5% of its RRAM cells, X-ABFT allows us to achieve nearly the same classification accuracy as in the fault-free case.
Item Open Access Highly Efficient Neuromorphic Computing Systems With Emerging Nonvolatile Memories (2020) Yan, Bonan
Emerging nonvolatile memory based hardware neuromorphic computing systems have enabled the implementation of general vector-matrix multiplication in a manner that fuses computation and memory at the same physical location. However, there remain three major challenges in designing such neuromorphic computing systems for high efficiency in large-scale integration: (a) the analog/digital interface circuits dominate the power and area in such mixed-signal designs; (b) they are highly customized and can only compute a class of neural network models once developed; (c) non-ideal device properties largely forfeit the benefit in terms of computational efficiency.
Designs of mixed-signal interface circuitry have been extensively studied, but a holistic design approach to very-large-scale integration, spanning circuit design, microarchitecture, and hardware/software co-simulation, has been overlooked for emerging nonvolatile memory based neuromorphic computing systems. The realization of such neuromorphic computing platforms requires: (a) efficient interface circuits as well as execution models; (b) appropriate reconfigurability at runtime for different neural network architectures; and (c) reliability enhancement methods to withstand imperfect fabrication and harsh operating environments.
Motivated by these demands, this dissertation first introduces an implementation scheme for a neuromorphic computing system that uses emerging nonvolatile memory as synapses and CMOS integrated circuits as neurons. To reduce the energy consumed by data communication, the neuron circuits improve upon conventional integrate-and-fire neuron circuits for better current-to-spike conversion efficiency. Trade-offs between throughput and latency are investigated and validated by a prototype 64Kb Resistive Random Access Memory based in-memory computing processing engine.
Next, this dissertation proposes a type of fully-memristive neuromorphic computing system architecture that incorporates Mott memristors as the neuron circuits. The small footprint and intrinsic bionic dynamics of emerging memory-based neuron circuits significantly reduce design complexity. This dissertation investigates and models the randomness that Mott memristors introduce. By suppressing it during inference and exploiting it during learning, the proposed system is optimized to balance inference accuracy and training efficiency.
Moreover, this dissertation advances the reconfigurability of emerging memory based neuromorphic computing systems by presenting a paradigm that supports post-fabrication switching between spiking and non-spiking neural network model execution. An improved version of time-to-first-spike temporal encoding is proposed that uses single spikes to accelerate execution.
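For readers unfamiliar with time-to-first-spike (TTFS) encoding, the following minimal sketch shows the basic textbook form, in which each input produces a single spike and stronger inputs fire earlier. The linear mapping, the `t_max` window, and the function names are assumptions made for illustration; this is not the improved encoding proposed in the dissertation.

```python
def ttfs_encode(values, t_max=100.0):
    """Map intensities in [0, 1] to single spike times; stronger inputs spike earlier.

    Assumed linear mapping: t = t_max * (1 - value).
    """
    times = []
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError("intensity must lie in [0, 1]")
        times.append(t_max * (1.0 - v))
    return times


def ttfs_decode(times, t_max=100.0):
    """Invert the linear mapping to recover the intensities."""
    return [1.0 - t / t_max for t in times]


pixels = [1.0, 0.5, 0.0]
spike_times = ttfs_encode(pixels)
print("spike times:", spike_times)
print("decoded    :", ttfs_decode(spike_times))
```

Because each value is carried by one spike, downstream neurons can in principle act as soon as the earliest (strongest) spikes arrive rather than waiting for a full rate-coded window.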
Finally, this dissertation presents hardware/software codesign techniques for the implementation of neuromorphic computing systems with emerging nonvolatile memories. A hardware/software co-simulation flow is developed and, building on it, a closed-loop design is proposed that enhances weight stability against read disturbance.
In summary, the dissertation tackles important problems in designing neuromorphic computing systems with emerging nonvolatile memories. The outcome of this research is expected not only to pave the way for realizing highly efficient artificial intelligence hardware, but also to shorten the product development cycle.
Item Open Access Printing Electronic Components from Copper-Infused Ink and Thermoplastic Mediums (2017) Flowers, Patrick
The demand for printable electronics has sharply increased in recent years and is projected to continue to rise. Unfortunately, electronic materials which are suitable for desired applications while being compatible with available printing techniques are still often lacking. This thesis addresses two such challenging areas.
In the realm of two-dimensional ink-based printing of electronics, a major barrier to the realization of printable computers that can run programs is the lack of a solution-coatable non-volatile memory with performance metrics comparable to silicon-based devices. To address this deficiency, I developed a nonvolatile memory based on Cu-SiO2 core-shell nanowires that can be printed from solution. It exhibits on-off ratios of 10^6, switching speeds of 50 ns, and a low operating voltage of 2 V, and it operates for at least 10^4 cycles without failure. Each of these metrics is similar to or better than Flash memory (the write speed is 20 times faster than Flash). Memory architectures based on the individual memory cells demonstrated here could enable the printing of the more complex, embedded computing devices that are expected to make up an internet of things.
Recently, the exploration of three-dimensional printing techniques to fabricate electronic materials began. A suitable general-purpose conductive thermoplastic filament was not available, however. In this work I examine the current state of conductive thermoplastic filaments, including a newly-released highly conductive filament that my lab has produced which we call Electrifi. I focus on the use of dual-material fused filament fabrication (FFF) to 3D print electronic components (conductive traces, resistors, capacitors, inductors) and circuits (a fully-printed high-pass filter). The resistivity of traces printed from conductive thermoplastic filaments made with carbon-black, graphene, and copper as conductive fillers was found to be 12, 0.78, and 0.014 ohm cm, respectively, enabling the creation of resistors with resistances spanning 3 orders of magnitude. The carbon black and graphene filaments were brittle and fractured easily, but the copper-based filament could be bent at least 500 times with little change in its resistance. Impedance measurements made on the thermoplastic filaments demonstrate that the copper-based filament had an impedance similar to a conductive PCB trace at 1 MHz. Dual-material 3D printing was used to fabricate a variety of inductors and capacitors with properties that could be predictably tuned by modifying either the geometry of the components or the materials used to fabricate the components. These resistors, capacitors, and inductors were combined to create a fully 3D printed high-pass filter with properties comparable to its conventional counterparts. The relatively low impedance of the copper-based filament enables its use to 3D print a receiver coil for wireless power transfer. We also demonstrate the ability to embed and connect surface mounted components in 3D printed objects with a low-cost ($1,000 in parts), open source dual-material 3D printer.
This work thus demonstrates the potential for FFF 3D printing to create complex, three-dimensional circuits composed of either embedded or fully-printed electronic components.
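To make the reported resistivities concrete, the sketch below applies the standard relation R = ρL/A to a straight printed trace for each of the three fillers. The resistivity values are those quoted in the abstract; the trace dimensions (5 cm long, 1 mm wide, 0.5 mm thick) and the function name `trace_resistance` are hypothetical and chosen only for illustration.

```python
# Resistivities quoted in the abstract, in ohm cm.
RESISTIVITY_OHM_CM = {"carbon black": 12.0, "graphene": 0.78, "copper": 0.014}


def trace_resistance(resistivity_ohm_cm, length_cm, width_cm, thickness_cm):
    """Resistance of a uniform rectangular trace: R = rho * L / (w * t)."""
    area_cm2 = width_cm * thickness_cm
    if length_cm <= 0 or area_cm2 <= 0:
        raise ValueError("trace dimensions must be positive")
    return resistivity_ohm_cm * length_cm / area_cm2


# Hypothetical trace geometry: 5 cm x 0.1 cm x 0.05 cm.
for filler, rho in RESISTIVITY_OHM_CM.items():
    r = trace_resistance(rho, length_cm=5.0, width_cm=0.1, thickness_cm=0.05)
    print(f"{filler:>12}: {r:,.1f} ohm")
```

With identical geometry, the spread between the carbon-black and copper results reflects the roughly three orders of magnitude between their resistivities, which is what allows resistor values to be tuned by material choice as well as by shape.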