Browsing by Department "Electrical and Computer Engineering"
Item Open Access 3D Microwave Imaging through Full Wave Methods for Heterogeneous Media (2011) Yuan, Mengqing. In this thesis, a 3D microwave imaging method is developed for a microwave imaging system with an arbitrary background medium. In our research group's previous study on breast cancer detection, a full wave inverse method, the Diagonal Tensor Approximation combined with the Born Iterative Method (DTA-BIM), was proposed to reconstruct the electrical profile of the inversion domain in a homogeneous background medium and a layered background medium. In order to evaluate the performance of the DTA-BIM method in a realistic microwave imaging system, an experimental prototype of an active 3D microwave imaging system with movable antennas is constructed. For objects immersed in a homogeneous or layered background medium, inversion results based on the experimental data show that the resolution of the DTA-BIM method can reach a quarter of the wavelength in the background medium, and that the system's signal-to-noise ratio (SNR) requirement is 10 dB. However, the drawbacks of this system make it difficult to implement in a realistic application. Thus, another active 3D microwave imaging system is proposed to overcome the problems of the previous system. The new system employs a fixed patch antenna array with electronic switches to record the data. However, the antenna array turns the inversion background into a non-canonical inhomogeneous medium, so the analytical Green's functions used in the original DTA-BIM method are no longer available. Thus, a modified DTA-BIM method, which uses numerical Green's functions combined with measured voltages, is proposed. This modified DTA-BIM method can perform inversion in a non-canonical inhomogeneous background using measured voltages (or $S_{21}$ parameters). In order to verify the performance of the proposed inversion method, we investigate a prototype 3D microwave imaging system with a fixed antenna array. Inversion results from synthetic data show that this method works well with a fixed antenna array, and that the resolution of the reconstructed images can reach a quarter wavelength even in the presence of a strongly inhomogeneous background medium and antenna coupling. A time-reversal method is introduced as a pre-processing step to reduce the region of interest (ROI) in the inversion. In addition, a multi-domain DTA-BIM method is proposed to handle discontinuous inversion regions. With these improvements, the size of the inversion domain and the computational cost can be significantly reduced, making the DTA-BIM method more feasible for rapid-response applications.
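As a rough illustration of the Born-type iteration that DTA-BIM builds on, the sketch below alternates a forward solve with a regularized linearized update for a toy discretized problem. The operators, the Tikhonov regularization, and the flat starting contrast are illustrative assumptions; this is not the thesis's DTA-BIM solver, which uses a diagonal tensor approximation and numerical Green's functions for the inhomogeneous background.

```python
import numpy as np

# Generic Born Iterative Method (BIM) loop for a toy discretized scattering problem.
def born_iterative_method(G_s, G_d, e_inc, d_meas, n_iter=10, reg=1e-3):
    """G_s: domain-to-receiver Green's operator (M x N).
    G_d: domain-to-domain Green's operator (N x N).
    e_inc: incident field in the N inversion cells.
    d_meas: scattered field measured at the M receivers."""
    n = G_s.shape[1]
    chi = np.zeros(n, dtype=complex)                      # contrast estimate
    for _ in range(n_iter):
        # Forward step: total field in the domain for the current contrast
        e_tot = np.linalg.solve(np.eye(n) - G_d @ np.diag(chi), e_inc)
        # Linearized (Born-type) data operator for this iteration
        A = G_s @ np.diag(e_tot)
        # Tikhonov-regularized least-squares update of the contrast
        chi = np.linalg.solve(A.conj().T @ A + reg * np.eye(n),
                              A.conj().T @ d_meas)
    return chi
```

Each pass re-solves the forward problem with the updated contrast, which is what distinguishes the iterative Born scheme from a single linearized (Born-approximation) inversion.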
Item Open Access A 3D Active Microwave Imaging System for Breast Cancer Screening (2008-12-11) Stang, John. A 3D microwave imaging system suitable for clinical trials has been developed. The anatomy, histology, and pathology of breast cancer were all carefully considered in the development of this system. The central component of this system is a breast imaging chamber with an integrated 3D antenna array containing 36 custom designed bowtie patch antennas that radiate efficiently into human breast tissue. 3D full-wave finite element method models of this imaging chamber, complete with full antenna geometry, have been developed using Ansoft HFSS and verified experimentally. In addition, an electronic switching system using Gallium Arsenide (GaAs) absorptive RF multiplexer chips, a custom hardware control system with a parallel port interface utilizing TTL logic, and a custom software package with graphical user interface using Java and LabVIEW have all been developed. Finally, modeling of the breast (both healthy and malignant) was done using published data of the dielectric properties of human tissue, confirming the feasibility of cancer detection using this system.
Item Open Access A CG-FFT Based Fast Full Wave Imaging Method and its Potential Industrial Applications (2015) Yu, Zhiru. This dissertation focuses on an FFT-based forward EM solver and its application to inverse problems. The main contributions of this work are twofold. On the one hand, it presents the first scaled lab experiment system in the oil and gas industry for through-casing hydraulic fracture evaluation. This system is established to validate the feasibility of contrast-enhanced fracture evaluation. On the other hand, this work proposes an FFT-based VIE solver for hydraulic fracture evaluation; such an efficient solver is needed for the numerical analysis of this problem. The solver is then generalized to accommodate scattering simulations for anisotropic inhomogeneous magnetodielectric objects. The inverse problem for anisotropic objects is also studied.
Before going into the details of specific applications, some background knowledge is presented. This dissertation starts with an introduction to inverse problems. Then algorithms for forward and inverse problems are discussed. The discussion of the forward problem focuses on the VIE formulation and a frequency-domain solver. The discussion of inverse problems focuses on iterative methods.
The rest of the dissertation is organized by the two categories of inverse problems, namely the inverse source problem and the inverse scattering problem.
The inverse source problem is studied via an application in microelectronics. In this application, an FFT-based inverse source solver is applied to process near-field data obtained by near-field scanners. Examples show that, with the help of this inverse source solver, the resolution of unknown current source images on a device under test is greatly improved. Due to the improvement in resolution, more flexibility is given to the near-field scan system.
Both the forward and inverse solvers for inverse scattering problems are studied in detail. As a forward solver for inverse scattering problems, a fast FFT-based method for solving the VIE of magnetodielectric objects with large electromagnetic contrasts is presented, motivated by the increasing interest in contrast-enhanced full-wave EM imaging. This newly developed VIE solver assigns basis functions of different orders to expand the flux densities and vector potentials; it is therefore called the mixed-order BCGS-FFT method. The mixed-order BCGS-FFT method retains the benefits of high-order basis functions for the VIE while keeping the correct boundary conditions for flux densities and vector potentials. Examples show that this method performs excellently on both isotropic and anisotropic objects with high contrasts. Examples also verify that the method is valid at both high and low frequencies. Based on the mixed-order BCGS-FFT method, an inverse scattering solver for anisotropic objects is studied. The inverse solver is formulated and solved by the variational Born iterative method. An example given in this section shows a successful inversion of an anisotropic magnetodielectric object.
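As a rough illustration of why the FFT matters in CG/BCGS-FFT solvers, the sketch below solves a toy 1D scalar volume integral equation with SciPy's BiCGSTAB, computing each matrix-vector product as a zero-padded FFT convolution with the background Green's kernel. The 1D kernel, contrast profile, and scalar formulation are illustrative assumptions, far simpler than the dissertation's 3D mixed-order vector solver.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, bicgstab

n = 256
k0 = 0.2                                               # background wavenumber
grid = np.arange(n)
chi = np.zeros(n); chi[100:120] = 0.3                  # toy object contrast

# 1D background Green's kernel g(|x - x'|) and its circulant embedding,
# so that one length-2n FFT evaluates the full convolution over the domain.
g = np.exp(1j * k0 * grid) / (2j * k0)
g_pad = np.concatenate([g, [0.0], g[:0:-1]])           # symmetric kernel, length 2n
G_hat = np.fft.fft(g_pad)

def matvec(e):
    # Apply (I - k0^2 * G * diag(chi)) with an FFT convolution: O(n log n) per product
    src = np.zeros(2 * n, dtype=complex)
    src[:n] = chi * e
    conv = np.fft.ifft(np.fft.fft(src) * G_hat)[:n]
    return e - k0 ** 2 * conv

A = LinearOperator((n, n), matvec=matvec, dtype=complex)
e_inc = np.exp(1j * k0 * grid)                         # incident plane wave
e_tot, info = bicgstab(A, e_inc)
print("BiCGSTAB converged" if info == 0 else f"info = {info}")
```

Replacing this toy scalar kernel with the dyadic Green's function of the 3D problem, and the point samples with mixed-order basis functions, gives the convolutional structure that BCGS-FFT exploits.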
Finally, a lab-scale hydraulic fracture evaluation system for oil/gas reservoirs, based on the previously discussed inverse solver, is presented. This system has been set up to verify the numerical results obtained from the previously described inverse solvers. These scaled experiments verify the accuracy of the forward solver as well as the performance of the inverse solver. Examples show that the inverse scattering model is able to evaluate contrast-enhanced hydraulic fractures in a shale formation. Furthermore, this system, for the first time in the oil and gas industry, verifies that hydraulic fractures can be imaged through a metallic casing.
Item Open Access A Compact Cryogenic Package Approach to Ion Trap Quantum Computing (2022) Spivey, Robert Fulton. Ion traps are a leading candidate for scaling quantum computers, but their component technologies can be difficult to integrate and manufacture, and experimental systems are subject to mechanical drift, creating a large maintenance overhead. A full system redesign with stability and scalability in mind is presented. At the center of our approach is a compact cryogenic ion trap package (trap cryopackage). A surface trap is mounted to a modified ceramic pin grid array (CPGA), which is enclosed by a copper lid. The differentially pumped trap cryopackage has all necessary optical feedthroughs and an ion source (ablation target). The pressure under the lid is held at ultra-high vacuum (UHV) by cryogenic sorption pumping using a carbon getter. We install this cryopackage into a commercial low-vibration closed-cycle cryostat, which sits inside a custom monolithic enclosure. The system is tested, and trapped ions are found to have a common-mode heating rate on the order of 10 quanta/s. The modular optical setup provides a counterpropagating single-qubit coherence time of 527 ms. We survey a population of FM two-qubit gates (gate times 120 μs - 450 μs) and find an average gate fidelity of 98%. We study the gate survey with quantum Monte Carlo simulation and find that our two-qubit gate fidelity is limited by low-frequency (30 Hz - 3 kHz) coherent electrical noise on our motional modes.
Item Open Access A Data-Intensive Framework for Analyzing Dynamic Supreme Court Behavior (2012) Calloway, Timothy Joseph. Many law professors and scholars think of the Supreme Court as a black box--issues and arguments go in to the Court, and decisions come out. The almost mystical nature that these researchers impute to the Court seems to be a function of the lack of hard data and statistics about the Court's decisions. Without a robust dataset from which to draw proper conclusions, legal scholars are often left only with intuition and conjecture.
Explaining the inner workings of one of the most important institutions in the United States using such a subjective approach is obviously flawed. And, indeed, data is available that can provide researchers with a better understanding of the Court's actions, but scholars have been slow in adopting a methodology based on data and statistical analysis. The sheer quantity of available data is overwhelming and might provide one reason why such an analysis has not yet been undertaken.
Relevant data for these studies is available from a variety of sources, but two in particular are of note. First, legal database provider LexisNexis provides a huge amount of information about how the Court's opinions are treated by subsequent opinions; thus, if the Court later overrules one of its earlier opinions, that information is captured by LexisNexis. Second, researchers at Washington University in St. Louis have compiled a database that provides detailed information about each Supreme Court decision. Combining these two sources into a coherent database will provide a treasure trove of results for future researchers to study, use, and build upon.
This thesis will explore a first-of-its-kind attempt to parse these massive datasets to provide a powerful tool for future researchers. It will also provide a window to help the average citizen understand Supreme Court behavior more clearly. By utilizing traditional data extraction and dataset analysis methods, many informative conclusions can be reached to help explain why the Court acts the way it does. For example, the results show that decisions decided by a narrow margin (i.e., by a 5 to 4 vote) are almost 4x more likely to be overruled than unanimous decisions by the Court. Many more results like these can be synthesized from the dataset and will be presented in this thesis. Perhaps more importantly, this thesis presents a framework to predict the outcomes of future and pending Supreme Court cases using statistical analysis of the data gleaned from the dataset.
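A hypothetical sketch of the kind of query behind the narrow-margin result quoted above; the column names (margin, overruled), the tiny hand-made table, and the relative-likelihood calculation are illustrative assumptions about how the merged LexisNexis / Washington University data might be summarized, not the thesis's actual dataset or code.

```python
import pandas as pd

# Toy stand-in for the merged case dataset: each row is one decision with its
# vote margin and whether it was later overruled (hypothetical values).
cases = pd.DataFrame({
    "margin":    ["5-4"] * 4 + ["9-0"] * 4,
    "overruled": [True, True, False, False, True, False, False, False],
})

rate = cases.groupby("margin")["overruled"].mean()
print(rate)
print("relative likelihood of being overruled (5-4 vs 9-0):",
      rate["5-4"] / rate["9-0"])
```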
In the end, this thesis strives to provide input data as well as results data for future researchers to use in studying Supreme Court behavior. It also provides a framework that researchers can use to analyze the input data to create even more results data.
Item Open Access A Formal Framework for Designing Verifiable Protocols (2017) Matthews, Opeoluwa. Protocols play critical roles in computer systems today, including managing resources, facilitating communication, and coordinating actions of components. It is highly desirable to formally verify protocols, to provide a mathematical guarantee that they behave correctly. Ideally, one would pass a model of a protocol into a formal verification tool, push a button, and the tool uncovers bugs or certifies that the protocol behaves correctly. Unfortunately, as a result of the state explosion problem, automated formal verification tools struggle to verify the increasingly complex protocols that appear in computer systems today.
We observe that design decisions have a significant impact on the scalability of verifying a protocol in a formal verification tool. Hence, we present a formal framework that guides architects in designing protocols specifically to be verifiable with state of the art formal verification tools. If architects design protocols to fit our framework, the protocols inherit scalable automated verification. Key to our framework is a modular approach to constructing protocols from pre-verified subprotocols. We formulate properties that can be proven in automated tools to hold on these subprotocols, guaranteeing that any arbitrary composition of the subprotocols behaves correctly. The result is that we can design complex hierarchical (tree) protocols that are formally verified, using fully automated tools, for any number of nodes or any configuration of the tree. Our framework is applicable to a large class of protocols, including power management, cache coherence, and distributed lock management protocols.
To demonstrate the efficacy of our framework, we design and verify a realistic hierarchical (tree) coherence protocol, using a fully automated tool to prove that it behaves correctly for any configuration of the tree. We identify certain protocol optimizations prohibited by our framework or the state of the art verification tools and we evaluate our verified protocol against unverifiable protocols that feature these optimizations. We find that these optimizations have a negligible impact on performance. We hope that our framework can be used to design a wide variety of protocols that are verifiable, high-performing, and architecturally flexible.
Item Open Access A Hybrid Spectral-Element / Finite-Element Time-Domain Method for Multiscale Electromagnetic Simulations (2010) Chen, Jiefu. In this study we propose a fast hybrid spectral-element time-domain (SETD) / finite-element time-domain (FETD) method for transient analysis of multiscale electromagnetic problems, where electrically fine structures with details much smaller than a typical wavelength and electrically coarse structures comparable to or larger than a typical wavelength coexist.
Simulations of multiscale electromagnetic problems, such as electromagnetic interference (EMI), electromagnetic compatibility (EMC), and electronic packaging, can be very challenging for conventional numerical methods. In terms of spatial discretization, conventional methods use a single mesh for the whole structure, so the high discretization density required to capture the geometric characteristics of electrically fine structures inevitably leads to a large number of wasted unknowns in the electrically coarse parts. This issue becomes especially severe for the orthogonal grids used by the popular finite-difference time-domain (FDTD) method. In terms of temporal integration, dense meshes in electrically fine domains make the time step size extremely small for numerical methods with explicit time-stepping schemes. Implicit schemes can surpass the stability limit imposed by the Courant-Friedrichs-Lewy (CFL) condition; however, due to the large system matrices generated by conventional methods, it is almost impossible to apply implicit schemes to the whole structure for time-stepping.
To address these challenges, we propose an efficient hybrid SETD/FETD method for transient electromagnetic simulations by taking advantage of the strengths of these two methods while avoiding their weaknesses in multiscale problems. More specifically, a multiscale structure is divided into several subdomains based on the electrical size of each part, and a hybrid spectral-element / finite-element scheme is proposed for spatial discretization. The hexahedron-based spectral elements with higher interpolation degrees are efficient in modeling electrically coarse structures, and the tetrahedron-based finite elements with lower interpolation degrees are flexible in discretizing electrically fine structures with complex shapes. A non-spurious finite element method (FEM) as well as a non-spurious spectral element method (SEM) is proposed to make the hybrid SEM/FEM discretization work. For time integration we employ hybrid implicit/explicit (IMEX) time-stepping schemes, where explicit schemes are used for electrically coarse subdomains discretized by coarse spectral element meshes, and implicit schemes are used to overcome the CFL limit for electrically fine subdomains discretized by dense finite element meshes. Numerical examples show that the proposed hybrid SETD/FETD method is free of spurious modes, is flexible in discretizing sophisticated structures, and is more efficient than conventional methods for multiscale electromagnetic simulations.
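The following sketch shows the basic implicit/explicit (IMEX) split that such time integration relies on, using a first-order Euler pairing on a toy linear system: the non-stiff part (standing in for coarsely meshed spectral-element subdomains) is advanced explicitly, while the stiff part (standing in for densely meshed finite-element subdomains) is advanced implicitly so the CFL limit of the fine mesh does not dictate the global step. The two diagonal operators and the step size are illustrative assumptions.

```python
import numpy as np

n = 200
A_coarse = -0.5 * np.eye(n)                        # mild (non-stiff) operator
A_fine = np.diag(-np.linspace(1.0, 500.0, n))      # stiff operator from a dense mesh

dt, steps = 0.01, 100                              # dt exceeds the explicit limit (~0.004) of A_fine
u = np.ones(n)
lhs = np.eye(n) - dt * A_fine                      # implicit (backward Euler) part

for _ in range(steps):
    rhs = u + dt * (A_coarse @ u)                  # explicit (forward Euler) part
    u = np.linalg.solve(lhs, rhs)                  # only the stiff block needs a solve

print("max |u| after", steps, "IMEX steps:", np.abs(u).max())
```

The stiff block is the only one that requires a linear solve each step, which is why confining implicit treatment to the finely meshed subdomains keeps the overall cost manageable.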
Item Open Access A Molecular-scale Programmable Stochastic Process Based On Resonance Energy Transfer Networks: Modeling And Applications (2016) Wang, Siyang. While molecular and cellular processes are often modeled as stochastic processes, such as Brownian motion, chemical reaction networks and gene regulatory networks, there are few attempts to program a molecular-scale process to physically implement stochastic processes. DNA has been used as a substrate for programming molecular interactions, but its applications are restricted to deterministic functions, and unfavorable properties such as slow processing, thermal annealing, aqueous solvents, and difficult readout limit them to proof-of-concept purposes. To date, whether there exists a molecular process that can be programmed to implement stochastic processes for practical applications remains unknown.
In this dissertation, a fully specified Resonance Energy Transfer (RET) network between chromophores is accurately fabricated via DNA self-assembly, and the exciton dynamics in the RET network physically implement a stochastic process, specifically a continuous-time Markov chain (CTMC), which has a direct mapping to the physical geometry of the chromophore network. Excited by a light source, a RET network generates random samples in the temporal domain in the form of fluorescence photons which can be detected by a photon detector. The intrinsic sampling distribution of a RET network is derived as a phase-type distribution configured by its CTMC model. The conclusion is that the exciton dynamics in a RET network implement a general and important class of stochastic processes that can be directly and accurately programmed and used for practical applications of photonics and optoelectronics. Different approaches to using RET networks exist with vast potential applications. As an entropy source that can directly generate samples from virtually arbitrary distributions, RET networks can benefit applications that rely on generating random samples such as 1) fluorescent taggants and 2) stochastic computing.
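A minimal sketch of what sampling from a phase-type distribution means operationally: simulate the continuous-time Markov chain until the exciton reaches the absorbing (photon-emitting) state and record the elapsed time. The three-state rate matrix, initial state, and absorbing state below are illustrative assumptions, not a fitted chromophore network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy CTMC: states 0 and 1 are transient, state 2 is absorbing.
# Off-diagonal entries are transition rates; each row sums to zero.
Q = np.array([[-3.0,  2.0,  1.0],
              [ 0.5, -2.5,  2.0],
              [ 0.0,  0.0,  0.0]])

def sample_absorption_time(Q, start=0, absorbing=2):
    t, s = 0.0, start
    while s != absorbing:
        rates = Q[s].copy()
        rates[s] = 0.0
        total = rates.sum()
        t += rng.exponential(1.0 / total)            # exponential holding time
        s = rng.choice(len(Q), p=rates / total)      # jump to the next state
    return t

samples = np.array([sample_absorption_time(Q) for _ in range(10_000)])
print("mean first-passage (photon-arrival) time:", samples.mean())
```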
By using RET networks between chromophores to implement fluorescent taggants with temporally coded signatures, the taggant design is not constrained by resolvable dyes and has a significantly larger coding capacity than spectrally or lifetime coded fluorescent taggants. Meanwhile, the taggant detection process becomes highly efficient, and the Maximum Likelihood Estimation (MLE) based taggant identification guarantees high accuracy even with only a few hundred detected photons.
Meanwhile, RET-based sampling units (RSU) can be constructed to accelerate probabilistic algorithms for wide applications in machine learning and data analytics. Because probabilistic algorithms often rely on iteratively sampling from parameterized distributions, they can be inefficient in practice on the deterministic hardware traditional computers use, especially for high-dimensional and complex problems. As an efficient universal sampling unit, the proposed RSU can be integrated into a processor / GPU as specialized functional units or organized as a discrete accelerator to bring substantial speedups and power savings.
Item Open Access A Semi-Empirical Monte Carlo Method of Organic Photovoltaic Device Performance in Resonant, Infrared, Matrix-Assisted Pulsed Laser Evaporation (RIR-MAPLE) Films (2015) Atewologun, Ayomide. Utilizing the power of Monte Carlo simulations, a novel, semi-empirical method for investigating the performance of organic photovoltaics (OPVs) in resonant infrared, matrix-assisted pulsed laser evaporation (RIR-MAPLE) films is explored. Emulsion-based RIR-MAPLE offers a unique and powerful alternative to solution processing in depositing organic materials for use in solar cells: in particular, its usefulness in controlling the nanoscale morphology of organic thin films and the potential for creating novel hetero-structures make it a suitable experimental backdrop for investigating trends through simulation and gaining a better understanding of how different thin film characteristics impact OPV device performance.
The work presented in this dissertation explores the creation of a simulation tool that relies heavily on measurable properties of RIR-MAPLE films that impact efficiency and can be used to inform film deposition and dictate paths for future improvements in OPV devices. The original nanoscale implementation of the Monte Carlo method for investigating OPV performance is transformed to enable direct comparison between simulation and experimental external quantum efficiency results. Next, a unique microscale formulation of the Dynamic Monte Carlo (DMC) model is developed based on the observable, fundamental differences between the morphologies of RIR-MAPLE and solution-processed bulk heterojunction (BHJ) films. This microscale model enables us to examine the sensitivity of device performance to various structural and electronic properties of the devices. Specifically, using confocal microscopy, we obtain an average microscale feature size for the RIR-MAPLE P3HT:PC61BM (1:1) BHJ system that represents a strategic starting point for utilizing the DMC as an empirical tool.
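To make the Monte Carlo idea concrete, here is a toy exciton random walk on a two-phase lattice: an exciton dissociates if it reaches a donor/acceptor interface within its hop budget and is counted as lost otherwise. The checkerboard morphology (8-site domains standing in for a measured feature size), lattice size, and hop budget are illustrative assumptions, not the dissertation's calibrated microscale DMC model.

```python
import numpy as np

rng = np.random.default_rng(1)

N, feature, max_hops = 64, 8, 200
xx, yy = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
morphology = ((xx // feature) + (yy // feature)) % 2     # 0 = donor, 1 = acceptor
moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def exciton_dissociates():
    x, y = rng.integers(0, N, size=2)                    # random generation site
    phase = morphology[x, y]
    for _ in range(max_hops):
        dx, dy = moves[rng.integers(0, 4)]               # nearest-neighbor hop
        x, y = (x + dx) % N, (y + dy) % N
        if morphology[x, y] != phase:                    # reached a heterojunction
            return True
    return False                                         # decayed before an interface

trials = 5000
efficiency = sum(exciton_dissociates() for _ in range(trials)) / trials
print(f"exciton dissociation fraction ~ {efficiency:.2f}")
```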
Building on this, the RIR-MAPLE P3HT:PC61BM OPV system is studied using input simulation parameters obtained from films with different material ratios and overall device structures based on characterization techniques such as grazing incidence-wide angle X-ray scattering (GI-WAXS) and X-ray photoelectron spectroscopy (XPS). The results from the microscale DMC simulation compare favorably to experimental data and allow us to articulate a well-informed critique on the strengths and limitations of the model as a predictive tool. The DMC is then used to analyze a different RIR-MAPLE BHJ system: PCPDTBT:PC71BM, where the deposition technique itself is investigated for differences in the primary solvents used during film deposition.
Finally, a multi-scale DMC model is introduced where morphology measurements taken at two different size scales, as well as structural and electrical characterization, provide a template that mimics the operation of OPVs. This final, semi-empirical tool presents a unique simulation opportunity for exploring the different properties of RIR-MAPLE deposited OPVs, their effects on OPV performance and potential design routes for improving device efficiencies.
Item Open Access A Serial Bitstream Processor for Smart Sensor Systems (2010) Cai, Xin. A full-custom integrated circuit design of a serial bitstream processor is proposed for remote smart sensor systems. This dissertation describes details of the architectural exploration, circuit implementation, algorithm simulation, and testing results. The design is fabricated and demonstrated to be a successful working processor for basic algorithm functions. In addition, the energy performance of the processor, in terms of energy per operation, is evaluated. Compared to a multi-bit sensor processor, the proposed sensor processor provides improved energy efficiency for serial sensor data processing tasks, and it also offers the advantages of a low transistor count and reduced area.
Operating in long-term, low data rate sensing environments, the serial bitstream processor developed is targeted at low-cost smart sensor systems with serial I/O communication through wireless links. This processor is an attractive option because of its low transistor count, easy on-chip integration, and programming flexibility for low data duty cycle smart sensor systems, where longer battery life, long-term monitoring and sensor reliability are critical.
The processor can be programmed for sensor processing algorithms such as delta-sigma processing, calibration, and self-test algorithms. It can also be modified to utilize Coordinate Rotation Digital Computer (CORDIC) algorithms. The applications of the proposed sensor processor include wearable or portable biomedical sensors for health care monitoring and autonomous environmental sensors.
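For reference, the core of a CORDIC rotation is nothing more than a fixed number of shift-and-add iterations; the sketch below computes sine and cosine that way. The 16-iteration depth and the use of floating-point host arithmetic (rather than the processor's serial bitstream datapath) are illustrative assumptions.

```python
import math

ANGLES = [math.atan(2.0 ** -i) for i in range(16)]
K = 1.0
for a in ANGLES:
    K *= math.cos(a)                    # aggregate CORDIC gain correction

def cordic_sin_cos(theta):
    x, y, z = 1.0, 0.0, theta
    for i, a in enumerate(ANGLES):
        d = 1.0 if z >= 0 else -1.0     # rotate toward the residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * a
    return y * K, x * K                 # (sin, cos)

s, c = cordic_sin_cos(math.pi / 6)
print(round(s, 4), round(c, 4))         # approximately 0.5, 0.8660
```

In hardware, the multiplications by 2^-i become bit shifts, which is what makes CORDIC attractive for a low-transistor-count serial datapath.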
Item Open Access A Study of Field Emission Based Microfabricated Devices (2008-04-25) Natarajan, Srividya. The primary goals of this study were to demonstrate and fully characterize a microscale ionization source (i.e. micro-ion source) and to determine the validity of impact ionization theory for microscale devices and pressures up to 100 mTorr. The field emission properties of carbon nanotubes (CNTs) along with Micro-Electro-Mechanical Systems (MEMS) design processes were used to achieve these goals. Microwave Plasma-enhanced CVD was used to grow vertically aligned Multi-Walled Carbon Nanotubes (MWNTs) on the microscale devices. A 4-dimensional parametric study focusing on CNT growth parameters confirmed that Fe catalyst thickness had a strong effect on MWNT diameter. The MWNT growth rate was seen to be a strong function of the methane-to-ammonia gas ratio during MWNT growth. A high methane-to-ammonia gas ratio was selected for MWNT growth on the MEMS devices in order to minimize growth time and ensure that the thermal budget of those devices was met.
A CNT-enabled microtriode device was fabricated and characterized. A new aspect of this device was the inclusion of a 10 micron-thick silicon dioxide electrical isolation layer. This thick oxide layer enabled anode current saturation and performance improvements such as an increase in dc amplification factor from 27 to 600. The same 3-panel device was also used as an ionization source. Ion currents were measured in the 3-panel micro-ion source for helium, argon, nitrogen and xenon in the 0.1 to 100 mTorr pressure range. A linear increase in ion current was observed for an increase in pressure. However, simulations indicated that the 3-panel design could be modified to improve performance as well as better understand device behavior. Thus, simulations and literature reports on electron impact ionization sources were used to design a new 4-panel micro-ion source. The 4-panel micro-ion source showed an approximate 10-fold performance improvement compared to the 3-panel ion source device. The improvement was attributed to the increased electron current and improved ion collection efficiency of the 4-panel device. Further, the same device was also operated in a 3-panel mode and showed superior performance compared to the original 3-panel device, mainly because of increased ion collection efficiency.
The effect of voltages applied to the different electrodes in the 4-panel micro-ion source on ion source performance was studied to better understand device behavior. The validity of the ion current equation (which was developed for macroscale ion sources operating at low pressures) in the 4-panel micro-ion source was studied. Experimental ion currents were measured for helium, argon and xenon in the 3 to 100 mTorr pressure range. For comparison, theoretical ion currents were calculated using the ion current equation for the 4-panel micro-ion source utilizing values calculated from SIMION simulations and measured electron currents. The measured ion current values in the 3 to 20 mTorr pressure range followed the calculated ion currents quite closely. A significant deviation was observed in the 20-100 mTorr pressure range. The experimental ion current values were used to develop a corrected empirical model for the 4-panel micro-ion source in this high pressure range (i.e., 3 to 100 mTorr). The role of secondary electrons and electron path lengths at higher pressures is discussed.
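For context, the macroscale electron-impact ion current relation referred to above is commonly written in the form below. This is a hedged restatement of the textbook expression, not the thesis's corrected empirical model, and the symbols (electron current $I_e$, ionization cross-section $\sigma$, neutral number density $n$, effective electron path length $L$) are the usual ones:

$$ I_{\mathrm{ion}} = I_e \, \sigma \, n \, L, \qquad n = \frac{P}{k_B T}, $$

so that, at fixed temperature and electron energy, the ion current scales linearly with pressure $P$, consistent with the linear trend reported for the low-pressure range before the high-pressure deviation sets in.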
Item Open Access Accelerated Motion Planning Through Hardware/Software Co-Design (2019) Murray, Sean. Robotics has the potential to dramatically change society over the next decade. Technology has matured such that modern robots can execute complex motions with sub-millimeter precision. Advances in sensing technology have driven down the price of depth cameras and increased their performance. However, the planning algorithms used in currently-deployed systems are too slow to react to changing environments; this has restricted the use of high degree-of-freedom (DOF) robots to tightly-controlled environments where planning in real time is not necessary.
Our work focuses on overcoming this challenge through careful hardware/software co-design. We leverage aggressive precomputation and parallelism to design accelerators for several components of the motion planning problem. We present architectures for accelerating collision detection as well as path search. We show how we can maintain flexibility even with custom hardware, and describe microarchitectures that we have implemented at the register-transfer level. We also show how to generate effective planning roadmaps for use with our designs.
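A toy sketch of the precompute-then-query idea behind collision-detection acceleration: each roadmap edge's swept volume is reduced offline to a voxel bitmask, and at runtime the voxelized obstacle grid from the depth sensor is intersected with every edge in parallel. The grid size, random occupancies, and NumPy bit-parallelism (standing in for custom hardware) are illustrative assumptions, not the register-transfer-level designs described in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(2)
num_edges, num_voxels = 1000, 4096

# Offline: swept-volume occupancy of every roadmap edge (one bit per voxel)
edge_swept = rng.random((num_edges, num_voxels)) < 0.02

# Online: voxelized obstacles from the current sensor frame
obstacles = rng.random(num_voxels) < 0.01

# An edge is invalid if its swept volume intersects any occupied voxel
colliding = (edge_swept & obstacles).any(axis=1)
print(f"{colliding.sum()} of {num_edges} roadmap edges pruned this frame")
```

Because every edge-voxel test is independent, the same computation maps naturally onto massively parallel custom logic, which is where the microsecond-scale latencies come from.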
Our accelerators bring the total planning latency to less than 3 microseconds, several orders of magnitude faster than the state of the art. This capability makes it possible to deploy systems that plan under uncertainty, use complex decision making algorithms, or plan for multiple robots in a workspace. We hope this technology will push robotics into domains and applications that were previously infeasible.
Item Open Access Accelerated Sepsis Diagnosis by Seamless Integration of Nucleic Acid Purification and Detection (2014) Hsu, BangNing. Background: The diagnosis of sepsis is challenging because the infection can be caused by more than 50 species of pathogens that might exist in the bloodstream in very low concentrations, e.g., less than 1 colony-forming unit/ml. As a result, among the current sepsis diagnostic methods there is an unsatisfactory trade-off between the assay time and the specificity of the derived diagnostic information. Although the present qPCR-based test is more specific than biomarker detection and faster than culturing, its 6-10 hr turnaround remains suboptimal relative to the 7.6%/hr rapid deterioration of the survival rate, and the 3 hr hands-on time is labor-intensive. To address these issues, this work aims to utilize the advances in microfluidic technologies to expedite and automate the "nucleic acid purification - qPCR sequence detection" workflow.
Methods and Results: This task is evaluated to be best approached by combining immiscible phase filtration (IPF) and digital microfluidic droplet actuation (DM) on a single fluidic device. In IPF, as nucleic acid-bound magnetic beads are transported from an aqueous phase to an immiscible phase, the carryover of aqueous contaminants is minimized by the high interfacial tension. Thus, unlike a conventional bead-based assay, the necessary degree of purification can be attained in a few wash steps. After IPF reduces the sample volume from a milliliter-sized lysate to a microliter-sized eluent, DM can be used to automatically prepare the PCR mixture. This begins with partitioning the eluent according to the desired number of multiplex qPCR reactions, and then transporting droplets of the PCR reagents to mix with the eluent droplets. Under the outlined approach, the IPF - DM integration should lead to a notably reduced turnaround and a hands-free "lysate-to-answer" operation.
As the first step towards such a diagnostic device, the primary objective of this thesis is to verify the feasibility of the IPF - DM integration. This is achieved in four phases. First, the suitable assays, fluidic device, and auxiliary systems are developed. Second, the extent of purification obtained per IPF wash, and hence the number of washes needed for uninhibited qPCR, are estimated via off-chip UV absorbance measurement and on-chip qPCR. Third, the performance of on-chip qPCR, particularly the copy number - threshold cycle correlation, is characterized. Lastly, the above developments culminate in an experiment that includes the following on-chip steps: DNA purification by IPF, PCR mixture preparation via DM, and target quantification using qPCR - thereby demonstrating the core procedures of the proposed approach.
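As a concrete example of the copy number / threshold-cycle (Ct) correlation mentioned above, the sketch below fits a log-linear standard curve to a dilution series and inverts it for an unknown sample; the dilution points, Ct values, and efficiency formula are generic qPCR conventions with made-up numbers, not data from this thesis.

```python
import numpy as np

# Hypothetical dilution series: known starting copy numbers and measured Ct values
known_copies = np.array([1e2, 1e3, 1e4, 1e5, 1e6])
measured_ct  = np.array([33.1, 29.8, 26.4, 23.0, 19.7])

# Standard curve: Ct is (approximately) linear in log10(copies)
slope, intercept = np.polyfit(np.log10(known_copies), measured_ct, 1)
efficiency = 10 ** (-1.0 / slope) - 1.0            # conventional amplification efficiency

def copies_from_ct(ct):
    # Invert the fitted curve to estimate the starting copy number of a sample
    return 10 ** ((ct - intercept) / slope)

print(f"slope = {slope:.2f}, efficiency = {efficiency:.1%}")
print(f"sample with Ct = 27.5 -> about {copies_from_ct(27.5):.0f} copies")
```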
Conclusions: It is proposed to expedite and automate qPCR-based multiplex sparse pathogen detection by combining IPF and DM on a fluidic device. As a start, this work demonstrated the feasibility of the IPF - DM integration. However, a more thermally robust device structure will be needed for later quantitative investigations, e.g., improving the bead - buffer mixing. Importantly, evidence indicates that future iterations of the IPF - DM fluidic device could reduce the sample-to-answer time by 75% to 1.5 hr and decrease the hands-on time by 90% to approximately 20 min.
Item Open Access Accelerating Probabilistic Computing with a Stochastic Processing Unit (2020) Zhang, Xiangyu. Statistical machine learning has become a more important workload for computing systems than ever before. Probabilistic computing is a popular approach in statistical machine learning, which solves problems by iteratively generating samples from parameterized distributions. As an alternative to deep neural networks, probabilistic computing provides conceptually simple, compositional, and interpretable models. However, probabilistic algorithms are often considered too slow on conventional processors due to the sampling overhead of 1) computing the parameters of a distribution and 2) generating samples from the parameterized distribution. A specialized architecture is needed to address both aspects.
In this dissertation, we claim a specialized architecture is necessary and feasible to efficiently support various probabilistic computing problems in statistical machine learning, while providing high-quality and robust results.
We start by exploring a probabilistic architecture to accelerate Markov Random Field (MRF) Gibbs sampling by utilizing the quantum randomness of optical-molecular devices---Resonance Energy Transfer (RET) networks. We provide a macro-scale prototype, the first such system to our knowledge, to experimentally demonstrate the capability of RET devices to parameterize a distribution and run a real application. Through a quantitative result-quality analysis, we further reveal design issues of an existing RET-based probabilistic computing unit (1st-gen RSU-G) that lead to unsatisfactory result quality in some applications. By exploring the design space, we propose a new RSU-G microarchitecture that empirically achieves the same result quality as 64-bit floating-point software, with the same area and modest power overheads compared with the 1st-gen RSU-G. An efficient stochastic probabilistic unit can thus be realized using RET devices.
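For context, the MRF Gibbs-sampling workload being accelerated reduces, in its simplest form, to repeatedly drawing each site from its conditional distribution given its neighbors; the sketch below does this in software for an Ising-type MRF. The lattice size, coupling strength, and sweep count are illustrative assumptions, not the benchmarks used in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(3)

N, beta, sweeps = 32, 0.4, 200
spins = rng.choice([-1, 1], size=(N, N))

for _ in range(sweeps):
    for i in range(N):
        for j in range(N):
            # Sum of the four nearest neighbors (periodic boundary)
            nb = (spins[(i + 1) % N, j] + spins[(i - 1) % N, j] +
                  spins[i, (j + 1) % N] + spins[i, (j - 1) % N])
            # Conditional probability of the site being +1 given its neighbors
            p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * nb))
            spins[i, j] = 1 if rng.random() < p_up else -1

print("mean magnetization:", spins.mean())
```

The inner draw (compute a conditional parameter, then sample from it) is exactly the per-site operation that a hardware sampling unit replaces.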
The RSU-G provides high-quality true random number generation (RNG). We further explore how the quality of an RNG relates to application end-point result quality. Unexpectedly, we discover that the target applications do not necessarily require high-quality RNGs---a simple 19-bit Linear-Feedback Shift Register (LFSR) does not degrade end-point result quality in the tested applications. Therefore, we propose a Stochastic Processing Unit (SPU) with a simple pseudo-RNG that achieves functionality equivalent to RSU-G while retaining the benefits of a CMOS digital circuit.
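A minimal software model of a 19-bit Fibonacci LFSR of the kind referred to above. The tap positions (19, 18, 17, 14) are a commonly tabulated maximal-length choice and an assumption for illustration, not a statement of the SPU's actual feedback polynomial.

```python
def lfsr19(seed=0x1, n=5):
    """Generate n pseudo-uniform samples in [0, 1) from a 19-bit Fibonacci LFSR."""
    state = seed & 0x7FFFF                   # 19-bit state; must be nonzero
    out = []
    for _ in range(n):
        # Feedback = XOR of bits 19, 18, 17, 14 (1-indexed taps)
        bit = ((state >> 18) ^ (state >> 17) ^ (state >> 16) ^ (state >> 13)) & 1
        state = ((state << 1) | bit) & 0x7FFFF
        out.append(state / float(1 << 19))   # scale the state to [0, 1)
    return out

print(lfsr19())
```

In hardware this is just a 19-bit shift register and a few XOR gates, which is why it is so much cheaper than a true RNG.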
The above results raise a subsequent question: can we be confident in using a probabilistic accelerator with various approximation techniques, even though the end-point result quality ("accuracy") is good on tested benchmarks? We find that current methodologies for evaluating the correctness of probabilistic accelerators are often incomplete, mostly focusing only on end-point result quality ("accuracy") while omitting other important statistical properties. Therefore, we claim that a probabilistic architecture should provide some measure (or guarantee) of statistical robustness. We take a first step toward defining metrics and a methodology for quantitatively evaluating the correctness of probabilistic accelerators. We propose three pillars of statistical robustness: 1) sampling quality, 2) convergence diagnostics, and 3) goodness of fit. We apply our framework to a representative MCMC accelerator (SPU) and surface design issues that cannot be exposed using only application end-point result quality. Finally, we demonstrate the benefits of this framework in guiding design space exploration through a case study showing that statistical robustness comparable to floating-point software can be achieved with limited precision, avoiding floating-point hardware overheads.
Item Open Access Accelerator Architectures for Deep Learning and Graph Processing (2020) Song, Linghao. Deep learning and graph processing are two big-data applications, and they are widely applied in many domains. The training of deep learning models is essential for inference and has not yet been fully studied. With data forward, error backward, and gradient calculation, deep learning training is a complicated process with high computation and communication intensity. Distributing computations across multiple heterogeneous accelerators to achieve high throughput and balanced execution, however, remains challenging. In this dissertation, I present AccPar, a principled and systematic method of determining the tensor partition across multiple heterogeneous accelerators for efficient training acceleration. Emerging resistive random access memory (ReRAM) is promising for processing in memory (PIM). For high-throughput training acceleration in ReRAM-based PIM accelerators, I present PipeLayer, an architecture for layer-wise pipelined parallelism. Graph processing is well known for poor locality and high memory bandwidth demand. In conventional architectures, graph processing incurs a significant amount of data movement and energy consumption. I present GraphR, the first ReRAM-based graph processing accelerator, which follows the principle of near-data processing and explores the opportunity of performing massive parallel analog operations with low hardware and energy cost. Sparse matrix-vector multiplication (SpMV), a subset of graph processing, is the key computation in iterative solvers for scientific computing. Efficiently accelerating floating-point processing in ReRAM remains a challenge. In this dissertation, I present ReFloat, a data format and a supporting accelerator architecture for low-cost floating-point processing in ReRAM for scientific computing.
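For reference, the SpMV kernel named above is, in its simplest CSR form, just a per-row gather and dot product; the sketch below checks a plain Python/NumPy version against SciPy. The random 0.1%-dense matrix is an illustrative assumption, and the software loop stands in for what a crossbar-based accelerator would evaluate in analog.

```python
import numpy as np
from scipy.sparse import random as sparse_random

n = 10000
A = sparse_random(n, n, density=0.001, format="csr", random_state=0)
x = np.ones(n)

def spmv_csr(data, indices, indptr, x):
    """y = A @ x for a matrix stored in compressed sparse row (CSR) form."""
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        start, end = indptr[row], indptr[row + 1]
        # Dot product of the row's stored nonzeros with the gathered x entries
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y

y = spmv_csr(A.data, A.indices, A.indptr, x)
print(np.allclose(y, A @ x))   # should print True
```

The irregular gather `x[indices[start:end]]` is the source of the poor locality and bandwidth pressure that near-data architectures aim to hide.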
Item Open Access Accurate and Efficient Methods for the Scattering Simulation of Dielectric Objects in a Layered Medium (2019) Huang, Weifeng. Electromagnetic scattering in a layered medium (LM) is important for many engineering applications, including hydrocarbon exploration. Various computational methods for tackling well logging simulations are summarized. Given their advantages and limitations, main attention is devoted to the surface integral equation (SIE) and its hybridization with the finite element method (FEM).
The thin dielectric sheet (TDS) based SIE, i.e., TDS-SIE, is introduced for the simulation of fractures. Its accuracy and efficiency are extensively demonstrated by simulating both conductive and resistive fractures. Fractures of varying apertures, conductivities, dipping angles, and extensions are also simulated and analyzed. With the aid of layered-medium Green's functions (LMGFs), TDS-SIE is extended into the LM, which results in the solver entitled LM-TDS-SIE.
In order to consider the borehole effect, the well-known loop and tree basis functions are utilized to overcome the low-frequency breakdown of the Poggio, Miller, Chang, Harrington, Wu, and Tsai (PMCHWT) formulation. This leads to the loop-tree (LT) enhanced PMCHWT formulation, which can be hybridized with TDS-SIE to simulate the borehole and fracture together. The resultant solver, referred to as LT-TDS, is further extended into the LM, which leads to the solver entitled LM-LT-TDS.
For inhomogeneous or complex structures, SIE alone is not suitable for scattering simulations. It becomes advantageous to hybridize FEM with SIE in the framework of the domain decomposition method (DDM), which allows independent treatment of each subdomain and nonconformal meshes between them. This hybridization can be substantially enhanced by the adoption of LMGFs and loop-tree bases, leading to the solver entitled LM-LT-DDM. In comparison with LM-LT-TDS, this solver is more powerful and able to handle more general low-frequency scattering problems in layered media.
Item Open Access Actively Tunable Plasmonic Nanostructures (2020) Wilson, Wade Mitchell. Active plasmonic nanostructures with tunable resonances promise to enable smart materials with multiple functionalities, on-chip spectral-based imaging, and low-power optoelectronic devices. A variety of tunable materials have been integrated with plasmonic structures, however, the tuning range in the visible regime has been limited and small on/off ratios are typical for dynamically switchable devices. An all-optical tuning mechanism is desirable for on-chip optical computing applications. Furthermore, plasmonic structures are traditionally fabricated on rigid substrates, restricting their application in novel environments such as in wearable technology.
In this dissertation, I explore the mechanisms behind dynamic tuning of plasmon resonances, as well as demonstrate all-optical tuning through multiple cycles by incorporating photochromic molecules into plasmonic nanopatch antennas. Exposure to ultraviolet (UV) light switches the molecules into a photoactive state, enabling dynamic control with on/off ratios up to 9.2 dB and a tuning figure of merit up to 1.43, defined as the ratio between the spectral shift and the initial line width of the plasmonic resonance. Moreover, the physical mechanisms underlying the large spectral shifts are elucidated by studying over 40 individual nanoantennas with fundamental resonances from 550 to 720 nm, revealing good agreement with finite-element simulations.
To fully explore the tuning capabilities, the molecules are incorporated into plasmonic metasurface absorbers based on the same geometry as the single nanoantennas. The increased interaction between film-coupled nanocubes and resonant dipoles in the photochromic molecules gives rise to strong coupling. The coupling strength can be quantified by the Rabi-splitting of the plasmon resonance at ~300 meV, well into the ultrastrong coupling regime.
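As a quick check of scale, the standard two-coupled-oscillator dispersion commonly used to read a Rabi splitting off an anticrossing is sketched below; the exciton energy, the plasmon tuning range, and g = 150 meV (half the ~300 meV splitting quoted above) are illustrative assumptions, not fitted values from the dissertation.

```python
import numpy as np

g = 0.150                                     # coupling strength, eV (assumed)
E_exciton = 2.10                              # molecular resonance, eV (assumed)
E_plasmon = np.linspace(1.8, 2.4, 7)          # tuned plasmon energy, eV

delta = E_plasmon - E_exciton                 # detuning
E_upper = (E_plasmon + E_exciton) / 2 + np.sqrt(g**2 + (delta / 2) ** 2)
E_lower = (E_plasmon + E_exciton) / 2 - np.sqrt(g**2 + (delta / 2) ** 2)

# At zero detuning the branch separation equals 2g, i.e. the Rabi splitting
print("minimum splitting (eV):", (E_upper - E_lower).min())
```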
Additionally, fluorescent emitters are incorporated into the tunable absorber platform to give dynamic control over their emission intensity. I use optical spectroscopy to investigate the capabilities of tunable plasmonic nanocavities coupled to dipolar photochromic molecules. By incorporating emission sources, active control over the peak photoluminescence (PL) wavelength and emission intensity is demonstrated with PL spectroscopy.
Beyond wavelength tuning of the plasmon resonance, design and characterization are performed toward the development of a pyroelectric photodetector that can be implemented on a flexible substrate, giving it the ability to conform to new shapes on demand. Photodetection in the NIR with responsivities up to 500 mV/W is demonstrated. A detailed plan is given for the next steps required to fully realize visible to short-wave infrared (SWIR) pyroelectric photodetection with a cost-effective, scalable fabrication process. This, in addition to real-time control over the plasmon resonance, opens new application spaces for photonic devices that integrate plasmonic nanoparticles and actively tunable materials.
Item Open Access Adaptive Brain-Computer Interface Systems For Communication in People with Severe Neuromuscular Disabilities (2016) Mainsah, Boyla O. Brain-computer interfaces (BCI) have the potential to restore communication or control abilities in individuals with severe neuromuscular limitations, such as those with amyotrophic lateral sclerosis (ALS). The role of a BCI is to extract and decode relevant information that conveys a user's intent directly from brain electro-physiological signals and translate this information into executable commands to control external devices. However, the BCI decision-making process is error-prone due to noisy electro-physiological data, representing the classic problem of efficiently transmitting and receiving information via a noisy communication channel.
This research focuses on P300-based BCIs which rely predominantly on event-related potentials (ERP) that are elicited as a function of a user's uncertainty regarding stimulus events, in either an acoustic or a visual oddball recognition task. The P300-based BCI system enables users to communicate messages from a set of choices by selecting a target character or icon that conveys a desired intent or action. P300-based BCIs have been widely researched as a communication alternative, especially in individuals with ALS who represent a target BCI user population. For the P300-based BCI, repeated data measurements are required to enhance the low signal-to-noise ratio of the elicited ERPs embedded in electroencephalography (EEG) data, in order to improve the accuracy of the target character estimation process. As a result, BCIs have relatively slower speeds when compared to other commercial assistive communication devices, and this limits BCI adoption by their target user population. The goal of this research is to develop algorithms that take into account the physical limitations of the target BCI population to improve the efficiency of ERP-based spellers for real-world communication.
In this work, it is hypothesised that building adaptive capabilities into the BCI framework can potentially give the BCI system the flexibility to improve performance by adjusting system parameters in response to changing user inputs. The research in this work addresses three potential areas for improvement within the P300 speller framework: information optimisation, target character estimation and error correction. The visual interface and its operation control the method by which the ERPs are elicited through the presentation of stimulus events. The parameters of the stimulus presentation paradigm can be modified to modulate and enhance the elicited ERPs. A new stimulus presentation paradigm is developed in order to maximise the information content that is presented to the user by tuning stimulus paradigm parameters to positively affect performance. Internally, the BCI system determines the amount of data to collect and the method by which these data are processed to estimate the user's target character. Algorithms that exploit language information are developed to enhance the target character estimation process and to correct erroneous BCI selections. In addition, a new model-based method to predict BCI performance is developed, an approach which is independent of stimulus presentation paradigm and accounts for dynamic data collection. The studies presented in this work provide evidence that the proposed methods for incorporating adaptive strategies in the three areas have the potential to significantly improve BCI communication rates, and the proposed method for predicting BCI performance provides a reliable means to pre-assess BCI performance without extensive online testing.
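A toy illustration of the kind of Bayesian evidence accumulation, language-model prior, and data-driven stopping discussed above for a P300 speller. The 6x6 grid, Gaussian classifier-score model, uniform prior placeholder, and 0.95 stopping threshold are illustrative assumptions, not the dissertation's algorithms.

```python
import numpy as np

rng = np.random.default_rng(4)

chars = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")   # 6x6 speller grid
prior = np.full(36, 1.0 / 36)       # placeholder; a language model would reshape this
target = 7                          # index of the true target character (simulated user)

log_post = np.log(prior)
for flash in range(200):
    flashed = rng.choice(36, size=6, replace=False)     # one stimulus subset
    # Simulated classifier score: higher on average when the target was flashed
    score = rng.normal(1.0 if target in flashed else 0.0, 1.0)
    in_flash = np.isin(np.arange(36), flashed)
    # Gaussian log-likelihood under "this character was flashed" vs "was not"
    log_like = np.where(in_flash, -(score - 1.0) ** 2 / 2, -(score - 0.0) ** 2 / 2)
    log_post += log_like
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    if post.max() > 0.95:                                # dynamic stopping rule
        break

print(f"selected '{chars[post.argmax()]}' after {flash + 1} flashes")
```

The number of flashes needed before the stopping rule fires is the quantity that adaptive data collection tries to minimize for each selection.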
Item Open Access Adaptive Discontinuous Galerkin Methods Applied to Multiscale & Multiphysics Problems towards Large-scale Modeling & Joint Imaging (2019) Zhan, Qiwei. Advanced numerical algorithms should offer scalability on increasingly powerful supercomputer architectures, adaptivity for intricately multiscale engineering problems, efficiency for extremely large-scale wave simulations, and stability at dynamic multi-phase coupling interfaces.
In this study, I will present a multi-scale & multi-physics 3D wave propagation simulator to tackle these grand scientific challenges. This simulator is based on a unified high-order discontinuous Galerkin (DG) method, with adaptive nonconformal meshes, for efficient wave propagation modeling. The algorithm is compatible with a diverse portfolio of real-world geophysical/biomedical applications, ranging from longstanding tough problems, such as arbitrary anisotropic elastic/electromagnetic materials, viscoelastic materials, poroelastic materials, piezoelectric materials, and fluid-solid coupling, to recent challenging topics, such as fracture-wave interactions.
Meanwhile, I will also present some important theoretical improvements. In particular, I will show innovative Riemann solvers, inspired by physical insight, within a unified mathematical framework; these are key to guaranteeing the stability and accuracy of the DG methods and domain decomposition methods.
Item Open Access Adaptive Methods for Machine Learning-Based Testing of Integrated Circuits and Boards (2020) Liu, Mengyun. The relentless growth in information technology and artificial intelligence (AI) is placing demands on integrated circuits and boards for high performance, added functionality, and low power consumption. As a result, design complexity and integration continue to increase, and emerging devices are being explored. However, these new trends lead to high test cost and challenges associated with semiconductor test.
Machine learning has emerged as a powerful enabler in various application domains, and it provides an opportunity to overcome the challenges associated with expert-based test. By taking advantage of powerful machine-learning techniques, useful information can be extracted from historical test data, and this information helps facilitate the testing process for both chips and boards.
Moreover, to attain test cost reduction with no degradation in test quality, adaptive methods for testing are now being advocated. In conventional testing methods, variations among different chips and different boards are ignored. As a result, the same test items are applied to all chips; online testing is carried out after every fixed interval; and immutable fault-diagnosis models are used for all boards. In contrast, adaptive methods observe changes in the distribution of testing data and dynamically adjust the testing process, and hence reduce the test cost. In this dissertation, we study solutions for both chip-level test and board-level test. Our objective is to design appropriate solutions for adapting machine-learning techniques to the testing domain.
For chip-level test, the dissertation first presents machine learning-based adaptive testing to drop unnecessary test items and reduce the test cost in high-volume chip manufacturing. The proposed testing framework uses the parametric test results from circuit probing test to train a quality-prediction model, partitions chips into different groups based on the predicted quality, and selects the different important test items for each group of chips. To achieve the same defect level as in prior work on adaptive testing, the proposed fine-grained adaptive testing method significantly reduces test cost.
Besides CMOS-based chips, emerging devices (e.g., resistive random access memory (ReRAM)) are being explored to implement AI chips with high energy efficiency. Due to the immature fabrication process, ReRAMs are vulnerable to dynamic faults. Instead of periodically interrupting the computing process and carrying out the testing process, the dissertation presents an efficient method to detect the occurrence of dynamic faults in ReRAM crossbars. This method monitors an indirect measure of the dynamic power consumption of each ReRAM crossbar and determines that faults have occurred when a changepoint is detected in the monitored power-consumption time series. The method also estimates the percentage of faulty cells in a ReRAM crossbar by training a machine learning-based predictive model. In this way, the time-consuming fault localization and error recovery steps are only carried out when a high defect rate is estimated, and hence the test time is considerably reduced.
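A minimal one-sided CUSUM changepoint detector on a simulated crossbar power trace, to make the monitoring idea concrete; the signal model (a mean shift at sample 600), drift, and threshold are illustrative assumptions, not the dissertation's detector or data.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated per-interval power readings: a dynamic fault raises the mean at step 600
power = np.concatenate([rng.normal(1.00, 0.05, 600),    # healthy crossbar
                        rng.normal(1.15, 0.05, 400)])   # after a dynamic fault

mean0, drift, threshold = 1.00, 0.02, 1.0
cusum, alarm_at = 0.0, None
for t, p in enumerate(power):
    cusum = max(0.0, cusum + (p - mean0 - drift))        # one-sided CUSUM statistic
    if cusum > threshold:
        alarm_at = t                                     # changepoint flagged here
        break

print("changepoint flagged at sample:", alarm_at)
```

Only when such an alarm fires would the expensive fault-localization and recovery steps be invoked.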
For board-level test, the cost associated with diagnosis and repair due to board-level failures is one of the highest contributors to board manufacturing cost. To reduce the cost associated with fault diagnosis, a machine learning-based diagnosis workflow has been developed in the dissertation to support board-level functional fault identification. In a production environment, the large volume of manufacturing data arrives in a streaming format and may exhibit a time-dependent concept drift. In order to process streaming data and adapt to concept drift, instead of using an immutable diagnosis model, this dissertation also presents a method that uses an online learning algorithm to incrementally update the identification model. Experimental results show that, with the help of online learning, the diagnosis accuracy is improved and the training time is significantly reduced.
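A small sketch of the incremental-update idea using scikit-learn's partial_fit on streaming batches, as opposed to retraining an immutable model; the synthetic syndrome features, two fault classes, and linear model are illustrative assumptions standing in for real repair records.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(6)

n_features, classes = 20, np.array([0, 1])
clf = SGDClassifier()                        # linear model trained by SGD

for batch in range(50):                      # each batch = newly arrived repair records
    X = rng.normal(size=(32, n_features))
    y = (X[:, 0] + 0.1 * rng.normal(size=32) > 0).astype(int)
    clf.partial_fit(X, y, classes=classes)   # incremental (online) update, no full retrain

X_test = rng.normal(size=(200, n_features))
y_test = (X_test[:, 0] > 0).astype(int)
print("streaming-model accuracy:", clf.score(X_test, y_test))
```

Because each update touches only the newest batch, the model can track concept drift at a fraction of the cost of periodic full retraining.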
The machine learning-based diagnosis workflow can identify board-level functional faults with high accuracy. However, the prediction accuracy is low when a new board has a limited amount of fail data and repair records. The dissertation presents a diagnosis system that can utilize domain-adaptation algorithms to transfer the knowledge learned from a mature board to a new board. Domain adaptation significantly reduces the requirement for the number of repair records from the new board, while achieving a relatively high diagnostic accuracy in the early stage of manufacturing a new product. The proposed domain adaptation workflow designs a metric to evaluate the similarity between two types of boards. Based on the calculated similarity value, different domain-adaptation algorithms are selected to transfer knowledge and train a diagnosis model.
In summary, this dissertation tackles important problems related to the testing of integrated circuits and boards. By considering variations among different chips or boards, machine learning-based adaptive methods enable the reduction of test cost. The proposed machine learning-based testing methods are expected to contribute to quality assurance and manufacturing-cost reduction in the semiconductor industry.