Browsing by Subject "Machine learning"
Item Open Access A Bayesian Strategy to the 20 Question Game with Applications to Recommender Systems (2017) Suresh, Sunith Raj
In this paper, we develop an algorithm that uses a Bayesian strategy to determine a sequence of questions for playing the 20 Question game. The algorithm is motivated by an application to active recommender systems. We first develop an algorithm that constructs a sequence of questions in which each question inquires about only a single binary feature. We test the performance of the algorithm in simulation studies and find that it performs relatively well under an informed prior. We then modify the algorithm to construct a sequence of questions in which each question inquires about two binary features via an AND conjunction. We test the performance of the modified algorithm in the same simulation studies and find that it does not significantly improve performance.
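The abstract does not spell out its question-selection rule, so the following is only a minimal illustrative sketch (not the thesis's algorithm): a greedy Bayesian strategy that asks the single binary-feature question whose expected answer splits the current posterior most evenly, followed by a Bayes update. The item set, feature matrix, noise rate eps, and prior below are all placeholders.

```python
import numpy as np

def choose_question(posterior, features):
    """Greedily pick the binary feature whose 'yes' probability is closest to 0.5.

    posterior: array of shape (n_items,), current belief over candidate items.
    features:  binary array of shape (n_items, n_features); features[i, j] = 1
               if item i has feature j.
    """
    p_yes = features.T @ posterior               # P(answer = yes) for each question
    return int(np.argmin(np.abs(p_yes - 0.5)))   # most informative single question

def update_posterior(posterior, features, q, answer, eps=0.05):
    """Bayes update after observing the (possibly noisy) answer to question q."""
    likelihood = np.where(features[:, q] == answer, 1.0 - eps, eps)
    posterior = posterior * likelihood
    return posterior / posterior.sum()

# Toy run with a uniform (uninformed) prior over 8 items and 5 binary features.
rng = np.random.default_rng(0)
features = rng.integers(0, 2, size=(8, 5))
posterior = np.full(8, 1 / 8)
q = choose_question(posterior, features)
posterior = update_posterior(posterior, features, q, answer=1)
print(q, posterior.round(3))
```

A two-feature AND question could be scored the same way by replacing a single feature column with the elementwise product of two feature columns.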
Item Open Access A Black-Scholes-integrated Gaussian Process Model for American Option Pricing (2020-04-15) Kim, Chiwan
Acknowledging the lack of option pricing models that simultaneously have high prediction power, high computational efficiency, and interpretations that abide by financial principles, we suggest a Black-Scholes-integrated Gaussian process (BSGP) learning model that is capable of making accurate predictions backed by fundamental financial principles. Most data-driven models boast strong computational power at the expense of inferential results that can be explained with financial principles. Vice versa, most closed-form stochastic (principle-driven) models provide interpretable inferential results at the cost of computational efficiency. By integrating the Black-Scholes computed price for an equivalent European option into the mean function of the Gaussian process, we can design a learning model that emphasizes the strengths of both data-driven and principle-driven approaches. Using American (SPY) call and put option price data from May to June 2019, we condition the Black-Scholes-mean Gaussian process prior on the observed data to derive the posterior distribution that is used to predict American option prices. Not only does the proposed BSGP model provide accurate predictions, high computational efficiency, and interpretable results, but it also captures the discrepancy between the theoretical option price approximation derived from Black-Scholes and the price predicted by the BSGP model.
Item Open Access A Comprehensive Framework for Adaptive Optics Scanning Light Ophthalmoscope Image Analysis (2019) Cunefare, David
Diagnosis, prognosis, and treatment of many ocular and neurodegenerative diseases, including achromatopsia (ACHM), require the visualization of microscopic structures in the eye. The development of adaptive optics ophthalmic imaging systems has made high-resolution visualization of ocular microstructures possible. These systems include the confocal and split detector adaptive optics scanning light ophthalmoscope (AOSLO), which can visualize human cone and rod photoreceptors in vivo. However, the avalanche of data generated by such imaging systems is often too large, costly, and time-consuming to be evaluated manually, making automation necessary. The few currently available automated cone photoreceptor identification methods are unable to reliably identify rods and cones in low-quality images of diseased eyes, which are common in clinical practice.
This dissertation describes the development of automated methods for the analysis of AOSLO images, specifically focusing on cone and rod photoreceptors, which are the biomarkers most commonly studied with these systems. A traditional image processing approach, which requires little training data and takes advantage of intuitive image features, is presented for detecting cone photoreceptors in split detector AOSLO images. The focus is then shifted to deep learning using convolutional neural networks (CNNs), which have been shown in other image processing tasks to be more adaptable and produce better results than classical image processing approaches, at the cost of requiring more training data and acting as a “black box”. A CNN-based method for detecting cones is presented and validated against state-of-the-art cone detection methods for confocal and split detector images. The CNN-based method is then modified to take advantage of multimodal AOSLO information in order to detect cones in images of subjects with ACHM. Finally, a significantly faster CNN-based approach is developed for the classification and detection of cones and rods, and is validated on images from both healthy and pathological subjects. Additionally, several image processing and analysis works on optical coherence tomography images that were carried out during the completion of this dissertation are presented.
The completion of this dissertation led to fast and accurate image analysis tools for the quantification of biomarkers in AOSLO images pertinent to an array of retinal diseases, lessening the reliance on subjective and time-consuming manual analysis. For the first time, automatic methods have comparable accuracy to humans for quantifying photoreceptors in diseased eyes. This is an important step in the long-term goal to facilitate early diagnosis, accurate prognosis, and personalized treatment of ocular and neurodegenerative diseases through optimal visualization and quantification of microscopic structures in the eye.
Item Open Access A High-Tech Solution for the Low Resource Setting: A Tool to Support Decision Making for Patients with Traumatic Brain Injury (2019) Elahi, Cyrus
Background. The confluence of a capacity-exceeding disease burden and persistent resource shortages has resulted in traumatic brain injury's (TBI) devastating impact in low- and middle-income countries (LMICs). Lifesaving care for TBI depends on accurate and timely decision making within the hospital. As a result of shortages of technology and highly skilled providers, treatment delays are common in low resource settings. This reality demands a low-cost, scalable, and accurate alternative to support decision making. Decision support tools leveraging the accuracy of modern prognostic modeling techniques represent one possible solution. This thesis is a collation of research dedicated to the advancement of TBI decision support technology in low resource settings. Methods. The study locations included three national and referral hospitals in Uganda and Tanzania. We performed a survival analysis, externally validated existing TBI prognostic models, developed our own prognostic model, and performed a feasibility study for TBI decision support tools in an LMIC. Results. The survival analysis revealed a greater surgical benefit for mild and moderate head injuries than for severe injuries; however, patients with severe injuries underwent surgery at a higher rate than those with mild or moderate injuries. We developed a prognostic model using machine learning that achieved a good level of accuracy. This model outperformed existing TBI models with regard to discrimination but not calibration. Our feasibility study captured the need for improved prognostication of TBI patients in the hospital. Conclusions. This pioneering work has provided a foundation for further investigation and implementation of TBI decision support technologies in low resource settings.
Item Open Access A Multi-Disciplinary Systems Approach for Modeling and Predicting Physiological Responses and Biomechanical Movement Patterns (2017) Mazzoleni, Michael
It is currently an exciting time to be doing research at the intersection of sports and engineering. Advances in wearable sensor technology now enable large quantities of physiological and biomechanical data to be collected from athletes with minimal obstruction and cost. These technological advances, combined with an increased public awareness of the relationship between exercise, fitness, and health, have created an environment in which engineering principles can be integrated with biomechanics, exercise physiology, and sports science to dramatically improve methods for physiological assessment, injury prevention, and athletic performance.
The first part of this dissertation develops a new method for analyzing heart rate (HR) and oxygen uptake (VO2) dynamics. A dynamical system model was derived based on the equilibria and stability of the HR and VO2 responses. The model accounts for nonlinear phenomena and person-specific physiological characteristics. A heuristic parameter estimation algorithm was developed to determine model parameters from experimental data. An artificial neural network (ANN) was developed to predict VO2 from HR and exercise intensity data. A series of experiments was performed to validate: 1) the ability of the dynamical system model to make accurate time series predictions for HR and VO2; 2) the ability of the dynamical system model to make accurate submaximal predictions for maximum heart rate (HRmax) and maximal oxygen uptake (VO2max); 3) the ability of the ANN to predict VO2 from HR and exercise intensity data; and 4) the ability of a system comprising an ANN, dynamical system model, and heuristic parameter estimation algorithm to make submaximal predictions for VO2max without requiring VO2 data collection. The dynamical system model was successfully validated through comparisons with experimental data. The model produced accurate time series predictions for HR and VO2 and, more importantly, the model was able to accurately predict HRmax and VO2max using data collected during submaximal exercise. The ANN was successfully able to predict VO2 responses using HR and exercise intensity as system inputs. The system comprising an ANN, dynamical system model, and heuristic parameter estimation algorithm was able to make accurate submaximal predictions for VO2max without requiring VO2 data collection.
The second part of this dissertation applies a support vector machine (SVM) to classify lower extremity movement patterns that are associated with increased lower extremity injury risk. Participants for this study each performed a jump-landing task, and experimental data was collected using two video cameras, two force plates, and a chest-mounted single-axis accelerometer. The video data was evaluated to classify the lower extremity movement patterns of the participants as either excellent or poor using the Landing Error Scoring System (LESS) assessment method. Two separate linear SVM classifiers were trained using the accelerometer data and the force plate data, respectively, with the LESS assessment providing the classification labels during training and evaluation. The same participants from this study also performed several bouts of treadmill running, and an additional set of linear SVM classifiers were trained using accelerometer data and gyroscope data to classify movement patterns, with the LESS assessment again providing the classification labels during training and evaluation. Both sets of SVMs performed with a high level of accuracy, and the objective and autonomous nature of the SVM screening methodology eliminates the subjective limitations associated with many current clinical assessment tools.
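As a rough illustration of the classification setup described above, here is a minimal linear SVM sketch in scikit-learn; the features, labels, and hyperparameters are synthetic placeholders rather than the study's actual accelerometer features and LESS ratings.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Synthetic stand-ins: rows are jump-landing trials, columns are features
# derived from the chest-mounted accelerometer; labels are LESS-based
# (1 = poor movement pattern, 0 = excellent).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=60) > 0).astype(int)

clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, dual=False))
scores = cross_val_score(clf, X, y, cv=5)   # estimate classification accuracy
print(f"mean CV accuracy: {scores.mean():.2f}")
```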
Item Open Access A Q-Learning Approach to Minefield Characterization from Unmanned Aerial Vehicles (2012) Daugherty, Stephen Greyson
The treasure hunt problem is to determine how a computational agent can maximize its ability to detect and/or classify multiple targets located in a region of interest (ROI) populated with multiple obstacles. One particular instance of this problem involves optimizing the performance of a sensor mounted on an unmanned aerial vehicle (UAV) flying over a littoral region in order to detect mines buried underground.
Buried objects (including non-metallic ones) have an effect on the thermal conductivity and heat retention of the soil in which they reside. Because of this, objects that are not very deep below the surface often create measurable thermal anomalies on the surface soil. Consequently, infrared (IR) sensors have the potential to find mines and minelike objects (referred to in this thesis as clutters).
As the sensor flies over the ROI, sensor data is obtained. The sensor receives the data as pixellated infrared light signatures. Using this, ground temperature measurements are recorded and used to generate a two-dimensional thermal profile of the field of view (FOV) and map that profile onto the geography of the ROI.
The input stream of thermal data is then passed to an image processor that estimates the size and shape of the detected target. Then a Bayesian Network (BN) trained from a database of known mines and clutters is used to provide the posterior probability that the evidence obtained by the IR sensor for each detected target was the result of a mine or a clutter. The output is a confidence level (CL), and each target is classified as a mine or a clutter according to the most likely explanation (MLE) for the sensor evidence. Though the sensor may produce incomplete, noisy data, inferences from the BN attenuate the problem.
Since sensor performance depends on altitude and environmental conditions, the value of the IR information can be further improved by choosing the flight path intelligently. This thesis assumes that the UAV is flying through an environmentally homogeneous ROI and addresses the question of how the optimal altitude can be determined for any given multi-dimensional environmental state.
In general, high altitudes result in poor resolution, whereas low altitudes result in very limited FOVs. The problem of weighing these tradeoffs can be addressed by creating a scoring function that is directly dependent on a comparison between sensor outputs and ground truth. The scoring function provides a flexible framework through which multiple mission objectives can be addressed by assigning different weights to correct detections, correct non-detections, false detections, and false non-detections.
The scoring function provides a metric of sensor performance that can be used as feedback to optimize the sensor altitude as a function of the environmental conditions. In turn, the scoring function can be empirically evaluated over a number of different altitudes and then converted to empirical Q scores that also weigh future rewards against immediate ones. These values can be used to train a neural network (NN). The NN filters the data and interpolates between discrete Q-values to provide information about the optimal sensor altitude.
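A minimal sketch of the kind of weighted scoring function and tabular Q-update described above; the weights, state and altitude discretization, learning rate, and discount factor are placeholder values, not those used in the thesis.

```python
import numpy as np

def score(n_td, n_tn, n_fd, n_fn, w=(1.0, 0.2, -1.0, -2.0)):
    """Weighted score over correct detections, correct non-detections,
    false detections, and false non-detections (weights are placeholders)."""
    return w[0] * n_td + w[1] * n_tn + w[2] * n_fd + w[3] * n_fn

# Tabular Q-learning over discrete (environmental state, altitude) pairs.
n_states, n_altitudes = 4, 6
Q = np.zeros((n_states, n_altitudes))
alpha, gamma = 0.1, 0.9            # learning rate and discount factor

def q_update(s, a, reward, s_next):
    """Standard one-step Q-learning update using the score as the reward."""
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

# e.g. one simulated sortie segment: environmental state 2, altitude index 3
q_update(s=2, a=3, reward=score(8, 40, 3, 1), s_next=2)
print(Q[2])
```

In the thesis's setup, a neural network would then be fitted to these empirical Q-values to interpolate between the discrete altitudes.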
The research described in this thesis can be used to determine the optimal control policy for an aircraft in two different situations. The global maximum of the Q-function can be used to determine the altitude at which a UAV should cruise over an ROI for which the environmental conditions are known a priori. Alternatively, the local maxima of the Q-function can be used to determine the altitude to which a UAV should move if the environmental variables change during flight.
This thesis includes the results of computer simulations of a sensor flying over an ROI. The ROI is populated with targets whose characteristics are based on actual mines and minelike objects. The IR sensor itself is modeled by using a BN to create a stochastic simulation of the sensor performance. The results demonstrate how Q-learning can be applied to signals from a UAV-mounted IR sensor whose data stream is preprocessed by a BN classifier in order to determine an optimal flight policy for a given set of environmental conditions.
Item Open Access A Search for Supersymmetry in Multi-b Jet Events with the ATLAS Detector (2019) Epland, Matthew Berg
A search for supersymmetry in pair-produced gluinos decaying via top squarks to the lightest neutralino is presented. Events with multiple hadronic jets, of which at least three must be identified as originating from b-quarks, and large amounts of missing transverse energy in the final state, are selected for study. The dataset utilized encompasses proton-proton collisions with a center-of-mass energy of sqrt(s) = 13 TeV and integrated luminosity of 79.9 fb-1 collected by the ATLAS experiment at the LHC from 2015 to 2017. The search employs a parameterized boosted decision tree (BDT) to separate supersymmetric signal events from standard model backgrounds. New methods for optimal BDT parameter point selection and signal region creation, as well as new soft kinematic variables, are exploited to increase the search's expected exclusion limit beyond prior analyses of the same dataset by 100-200 GeV in the gluino and neutralino mass plane. No excess is observed in data above the predicted background, extending the previous exclusion limit at the 95% confidence level by 250 GeV to approximately 1.4 TeV in neutralino mass. The analytical and machine learning techniques developed here will benefit future analysis of additional Run 2 data from 2018.
Item Open Access A Semi-Supervised Predictive Model to Link Regulatory Regions to Their Target Genes (2015) Hafez, Dina Mohamed
Next generation sequencing technologies have provided us with a wealth of data profiling a diverse range of biological processes. In an effort to better understand the process of gene regulation, two predictive machine learning models specifically tailored for analyzing gene transcription and polyadenylation are presented.
Transcriptional enhancers are specific DNA sequences that act as "information integration hubs" to confer regulatory requirements on a given cell. These non-coding DNA sequences can regulate genes from long distances, or across chromosomes, and their relationships with their target genes are not limited to one-to-one. With thousands of putative enhancers and less than 14,000 protein-coding genes, detecting enhancer-gene pairs becomes a very complex machine learning and data analysis challenge.
In order to predict these specific sequences and link them to the genes they regulate, we developed McEnhancer. Using DNaseI sensitivity data and annotated in-situ hybridization gene expression clusters, McEnhancer builds interpolated Markov models to learn the enriched sequence content of known enhancer-gene pairs and predicts unknown interactions with a semi-supervised learning algorithm. Classification of predicted relationships was 73-98% accurate for gene sets with varying levels of initial known examples. Predicted interactions showed substantial overlap with Hi-C-identified interactions, and enrichment of known functionally related TF binding motifs, enhancer-associated histone modification marks, and the corresponding developmental time points was highly evident.
On the other hand, pre-mRNA cleavage and polyadenylation is an essential step for 3'-end maturation and subsequent stability and degradation of mRNAs. This process is highly controlled by cis-regulatory elements surrounding the cleavage site (polyA site), which are frequently constrained by sequence content and position. More than 50% of human transcripts have multiple functional polyA sites, and the specific use of alternative polyA sites (APA) results in isoforms with variable 3'-UTRs, thus potentially affecting gene regulation. Elucidating the regulatory mechanisms underlying differential polyA preferences in multiple cell types has been hindered by the lack of appropriate tests for determining APAs with significant differences across multiple libraries.
We specified a linear effects regression model to identify tissue-specific biases indicating regulated APA; the significance of differences between tissue types was assessed by an appropriately designed permutation test. This combination allowed us to identify highly specific subsets of APA events in the individual tissue types. Predictive kernel-based SVM models successfully classified constitutive polyA sites from a biologically relevant background (auROC = 99.6%), as well as tissue-specific regulated sets from each other. The main cis-regulatory elements described for polyadenylation were found to be a strong, and highly informative, hallmark for constitutive sites only. Tissue-specific regulated sites were found to contain other regulatory motifs, with the canonical PAS signal being nearly absent at brain-specific sites. We applied this model to data for SRp20, an RNA-binding protein that might be involved in oncogene activation, and obtained interesting insights.
Together, these two models contribute to the understanding of enhancers and the key role they play in regulating tissue-specific expression patterns during development, as well as provide a better understanding of the diversity of post-transcriptional gene regulation in multiple tissue types.
Item Open Access A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results (2018) Coker, Beau
Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting uncertainty. Any one theory of inference is neither right nor wrong, but merely an axiom that may or may not be useful. Each of the many diverse theories of inference can be valuable for certain applications. However, no existing theory of inference addresses the tendency to choose, from the range of plausible data analysis specifications consistent with prior evidence, those that inadvertently favor one's own hypotheses. Since the biases from these choices are a growing concern across scientific fields, and in a sense the reason the scientific community was invented in the first place, we introduce a new theory of inference designed to address this critical problem. From this theory, we derive "hacking intervals," which are the range of summary statistics one may obtain given a class of possible endogenous manipulations of the data. They make no appeal to hypothetical data sets drawn from imaginary superpopulations. A scientific result with a small hacking interval is more robust to researcher manipulation than one with a larger interval, and is often easier to interpret than a classic confidence interval. Hacking intervals turn out to be equivalent to classical confidence intervals under the linear regression model, and are equivalent to profile likelihood confidence intervals under certain other conditions, which means they may sometimes provide a more intuitive and potentially more useful interpretation of classical intervals.
Item Open Access Accelerating Probabilistic Computing with a Stochastic Processing Unit (2020) Zhang, Xiangyu
Statistical machine learning has become a more important workload for computing systems than ever before. Probabilistic computing is a popular approach in statistical machine learning, which solves problems by iteratively generating samples from parameterized distributions. As an alternative to Deep Neural Networks, probabilistic computing provides conceptually simple, compositional, and interpretable models. However, probabilistic algorithms are often considered too slow on conventional processors due to the sampling overhead of 1) computing the parameters of a distribution and 2) generating samples from the parameterized distribution. A specialized architecture is needed to address both of the above aspects.
In this dissertation, we claim a specialized architecture is necessary and feasible to efficiently support various probabilistic computing problems in statistical machine learning, while providing high-quality and robust results.
We start by exploring a probabilistic architecture to accelerate Markov Random Field (MRF) Gibbs Sampling by utilizing the quantum randomness of optical-molecular devices: Resonance Energy Transfer (RET) networks. We provide a macro-scale prototype, the first such system to our knowledge, to experimentally demonstrate the capability of RET devices to parameterize a distribution and run a real application. By doing a quantitative result quality analysis, we further reveal the design issues of an existing RET-based probabilistic computing unit (1st-gen RSU-G) that lead to unsatisfactory result quality in some applications. By exploring the design space, we propose a new RSU-G microarchitecture that empirically achieves the same result quality as 64-bit floating-point software, with the same area and modest power overheads compared with 1st-gen RSU-G. An efficient stochastic probabilistic unit can thus be realized using RET devices.
The RSU-G provides high-quality true Random Number Generation (RNG). We further explore how the quality of an RNG relates to application end-point result quality. Unexpectedly, we discover the target applications do not necessarily require high-quality RNGs: a simple 19-bit Linear-Feedback Shift Register (LFSR) does not degrade end-point result quality in the tested applications. Therefore, we propose a Stochastic Processing Unit (SPU) with a simple pseudo RNG that achieves functionality equivalent to the RSU-G but maintains the benefits of a CMOS digital circuit.
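For concreteness, a minimal 19-bit Fibonacci LFSR is sketched below; the tap positions are illustrative placeholders rather than the SPU's actual configuration, which a real design would take from a table of primitive polynomials.

```python
def lfsr19(seed, taps=(19, 6, 2, 1), n_bits=16):
    """Fibonacci LFSR over a 19-bit state register.

    seed: nonzero 19-bit initial state.
    taps: bit positions (1-indexed from the least significant bit) XORed to
          form the feedback bit; illustrative values only.
    Yields n_bits pseudo-random output bits.
    """
    state = seed & ((1 << 19) - 1)
    assert state != 0, "LFSR state must be nonzero"
    for _ in range(n_bits):
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1          # XOR the tapped bits
        out = state & 1                           # output the LSB
        state = (state >> 1) | (fb << 18)         # shift in the feedback bit
        yield out

print(list(lfsr19(seed=0b1010101010101010101)))
```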
The above results bring up a subsequent question: are we confident to use a probabilistic accelerator with various approximation techniques, even though the end-point result quality ("accuracy") is good in tested benchmarks? We found current methodologies for evaluating correctness of probabilistic accelerators are often incomplete, mostly focusing only on end-point result quality ("accuracy") but omitting other important statistical properties. Therefore, we claim a probabilistic architecture should provide some measure (or guarantee) of statistical robustness. We take a first step toward defining metrics and a methodology for quantitatively evaluating correctness of probabilistic accelerators. We propose three pillars of statistical robustness: 1) sampling quality, 2) convergence diagnostic, and 3) goodness of fit. We apply our framework to a representative MCMC accelerator (SPU) and surface design issues that cannot be exposed using only application end-point result quality. Finally, we demonstrate the benefits of this framework to guide design space exploration in a case study showing that statistical robustness comparable to floating-point software can be achieved with limited precision, avoiding floating-point hardware overheads.
Item Open Access Adaptive Methods for Machine Learning-Based Testing of Integrated Circuits and Boards (2020) Liu, Mengyun
The relentless growth in information technology and artificial intelligence (AI) is placing demands on integrated circuits and boards for high performance, added functionality, and low power consumption. As a result, design complexity and integration continue to increase, and emerging devices are being explored. However, these new trends lead to high test cost and challenges associated with semiconductor test.
Machine learning has emerged as a powerful enabler in various application domains, and it provides an opportunity to overcome the challenges associated with expert-based test. By taking advantage of powerful machine-learning techniques, useful information can be extracted from historical test data, and this information helps facilitate the testing process for both chips and boards.
Moreover, to attain test cost reduction with no test quality degradation, adaptive methods for testing are now being advocated. In conventional testing methods, variations among different chips and different boards are ignored. As a result, the same test items are applied to all chips; online testing is carried out at fixed intervals; immutable fault-diagnosis models are used for all boards. In contrast, adaptive methods observe changes in the distribution of testing data and dynamically adjust the testing process, and hence reduce the test cost. In this dissertation, we study solutions for both chip-level test and board-level test. Our objective is to design the most appropriate solutions for adapting machine-learning techniques to the testing domain.
For chip-level test, the dissertation first presents machine learning-based adaptive testing to drop unnecessary test items and reduce the test cost in high-volume chip manufacturing. The proposed testing framework uses the parametric test results from circuit probing test to train a quality-prediction model, partitions chips into different groups based on the predicted quality, and selects the different important test items for each group of chips. To achieve the same defect level as in prior work on adaptive testing, the proposed fine-grained adaptive testing method significantly reduces test cost.
Besides CMOS-based chips, emerging devices (e.g., resistive random access memory (ReRAM)) are being explored to implement AI chips with high energy efficiency. Due to the immature fabrication process, ReRAMs are vulnerable to dynamic faults. Instead of periodically interrupting the computing process and carrying out the testing process, the dissertation presents an efficient method to detect the occurrence of dynamic faults in ReRAM crossbars. This method monitors an indirect measure of the dynamic power consumption of each ReRAM crossbar and determines the occurrence of faults when a changepoint is detected in the monitored power-consumption time series. The method also estimates the percentage of faulty cells in a ReRAM crossbar by training a machine learning-based predictive model. In this way, the time-consuming fault localization and error recovery steps are only carried out when a high defect rate is estimated, and hence the test time is considerably reduced.
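A minimal sketch of the kind of changepoint check on a power-consumption time series described above, using a simple rolling mean-shift test; the window size, threshold, and simulated trace are placeholders, and the dissertation's actual detector may differ.

```python
import numpy as np

def mean_shift_changepoint(power, window=20, threshold=4.0):
    """Flag a changepoint when the mean of the latest window deviates from the
    preceding window by more than `threshold` standard errors (placeholder test)."""
    for t in range(2 * window, len(power) + 1):
        prev, curr = power[t - 2 * window:t - window], power[t - window:t]
        se = np.sqrt(prev.var(ddof=1) / window + curr.var(ddof=1) / window)
        if se > 0 and abs(curr.mean() - prev.mean()) > threshold * se:
            return t            # time step at which the shift is detected
    return None

# Simulated crossbar power trace: a fault raises dynamic power after step 120.
rng = np.random.default_rng(2)
power = np.concatenate([rng.normal(1.0, 0.05, 120), rng.normal(1.3, 0.05, 80)])
print(mean_shift_changepoint(power))
```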
For board-level test, the cost associated with diagnosis and repair of board-level failures is one of the highest contributors to board manufacturing cost. To reduce the cost associated with fault diagnosis, a machine learning-based diagnosis workflow has been developed in this dissertation to support board-level functional fault identification. In a production environment, the large volume of manufacturing data comes in a streaming format and may exhibit a time-dependent concept drift. In order to process streaming data and adapt to concept drifts, instead of using an immutable diagnosis model, this dissertation also presents a method that uses an online learning algorithm to incrementally update the identification model. Experimental results show that, with the help of online learning, the diagnosis accuracy is improved, and the training time is significantly reduced.
The machine learning-based diagnosis workflow can identify board-level functional faults with high accuracy. However, the prediction accuracy is low when a new board has a limited amount of fail data and repair records. The dissertation presents a diagnosis system that can utilize domain-adaptation algorithms to transfer the knowledge learned from a mature board to a new board. Domain adaptation significantly reduces the requirement for the number of repair records from the new board, while achieving a relatively high diagnostic accuracy in the early stage of manufacturing a new product. The proposed domain adaptation workflow designs a metric to evaluate the similarity between two types of boards. Based on the calculated similarity value, different domain-adaptation algorithms are selected to transfer knowledge and train a diagnosis model.
In summary, this dissertation tackles important problems related to the testing of integrated circuits and boards. By considering variations among different chips or boards, machine learning-based adaptive methods enable the reduction of test cost. The proposed machine learning-based testing methods are expected to contribute to quality assurance and manufacturing-cost reduction in the semiconductor industry.
Item Open Access Advancements in Probabilistic Machine Learning and Causal Inference for Personalized Medicine (2019) Lorenzi, Elizabeth Catherine
In this dissertation, we present four novel contributions to the field of statistics with the shared goal of personalizing medicine to individual patients. These methods are developed to directly address problems in health care through two subfields of statistics: probabilistic machine learning and causal inference. These projects include improving predictions of adverse events after surgery and learning the effectiveness of treatments for specific subgroups and for individuals. We begin the dissertation in Chapter 1 with a discussion of personalized medicine, the use of electronic health record (EHR) data, and a brief discussion on learning heterogeneous treatment effects. In Chapter 2, we present a novel algorithm, Predictive Hierarchical Clustering (PHC), for agglomerative hierarchical clustering of current procedural terminology (CPT) codes. Our predictive hierarchical clustering aims to cluster subgroups, not individual observations, found within our data, such that the clusters discovered result in optimal performance of a classification model, specifically for predicting surgical complications. In Chapter 3, we develop a hierarchical infinite latent factor model (HIFM) to appropriately account for the covariance structure across subpopulations in data. We propose a novel Hierarchical Dirichlet Process shrinkage prior on the loadings matrix that flexibly captures the underlying structure of our data across subpopulations while sharing information to improve inference and prediction. We apply this work to the problem of predicting surgical complications using electronic health record data for geriatric patients at Duke University Health System (DUHS). The last chapters of the dissertation address personalized medicine from a causal perspective, where the goal is to understand how interventions affect individuals, not full populations. In Chapter 4, we address heterogeneous treatment effects across subgroups, where guidance for observational comparisons within subgroups is lacking as is a connection to classic design principles for propensity score (PS) analyses. We address these shortcomings by proposing a novel propensity score method for subgroup analysis (SGA) that seeks to balance existing strategies in an automatic and efficient way. With the use of overlap weights, we prove that an over-specified propensity model including interactions between subgroups and all covariates results in exact covariate balance within subgroups. This is paired with variable selection approaches to adjust for a possibly overspecified propensity score model. Finally, Chapter 5 discusses our final contribution, a longitudinal matching algorithm aiming to predict individual treatment effects of a medication change for diabetes patients. This project aims to develop a novel and generalizable causal inference framework for learning heterogeneous treatment effects from Electronic Health Records (EHR) data. The key methodological innovation is to cast the sparse and irregularly-spaced EHR time series into functional data analysis in the design stage to adjust for confounding that changes over time. We conclude the dissertation and discuss future work in Section 6, outlining many directions for continued research on these topics.
Item Open Access An Investigation into the Bias and Variance of Almost Matching Exactly Methods (2021) Morucci, Marco
The development of interpretable causal estimation methods is a fundamental problem for high-stakes decision settings in which results must be explainable. Matching methods are highly explainable, but often lack the accuracy of black-box nonparametric models for causal effects. In this work, we propose to investigate theoretically the statistical bias and variance of Almost Matching Exactly (AME) methods for causal effect estimation. These methods aim to overcome the inaccuracy of matching by learning on a separate training dataset an optimal metric to match units on. While these methods are both powerful and interpretable, we currently lack an understanding of their statistical properties. In this work we present a theoretical characterization of the finite-sample and asymptotic properties of AME. We show that AME with discrete data has bounded bias in finite samples, and is asymptotically normal and consistent at a root-n rate. Additionally, we show that AME methods for matching on networked data also have bounded bias and variance in finite-samples, and achieve asymptotic consistency in sparse enough graphs. Our results can be used to motivate the construction of approximate confidence intervals around AME causal estimates, providing a way to quantify their uncertainty.
Item Open Access An Investigation of Machine Learning Methods for Delta-radiomic Feature Analysis (2018) Chang, Yushi
Background: Radiomics is a process of converting medical images into high-dimensional quantitative features and subsequently mining these features to provide decision support. It has potential as a noninvasive, low-cost, and patient-specific routine clinical tool. Building a predictive model that is reliable, efficient, and accurate is vital to the success of radiomics, and machine learning methods are a powerful tool for achieving this. Feature extraction strongly affects the performance. Delta-features are one type of extracted feature; they reflect the spatial variation in tumor phenotype and hence could provide better treatment-specific assessment.
Purpose: To compare the performance of using pre-treatment features and delta-features for assessing the brain radiosurgery treatment response, and to investigate the performance of different combinations of machine learning methods for feature selection and for feature classification.
Materials and Methods: A cohort of 12 patients treated with brain radiosurgery was included in this research. The pre-treatment, one-week post-treatment, and two-month post-treatment T1 and T2 FLAIR MR images were acquired. Sixty-one radiomic features were extracted from the gross tumor volume (GTV) of each image. The delta-features from pre-treatment to the two post-treatment time points were calculated. With leave-one-out sampling, pre-treatment features and the two sets of delta-features were separately input into a univariate Cox regression model and a machine learning model (L1-regularized logistic regression [L1-LR], random forest [RF], or neural network [NN]) for feature selection. Then a machine learning method (L1-LR, L2-regularized logistic regression [L2-LR], RF, NN, kernel support vector machine [Kernel-SVM], linear-SVM, or naïve Bayes [NB]) was used to build a classification model to predict overall survival. The performance of each model combination and feature type was estimated by the area under the receiver operating characteristic (ROC) curve (AUC).
Results: The AUC of one-week delta-features was significantly higher than that of pre-treatment features (p-values < 0.0001) and two-month delta-features (p-value= 0.000). The model combinations of L1-LR for feature selection and RF for classification as well as RF for feature selection and NB for classification based on one-week delta-features presented the highest AUC values (both AUC=0.944).
Conclusions: This work suggests that delta-features could be better at predicting treatment response than pre-treatment features, and that the time point at which the delta-features are computed is a vital factor in assessment performance. Analyzing delta-features using a suitable machine learning approach is potentially a powerful tool for assessing treatment response. A minimal sketch of one such feature-selection and classification pipeline appears below.
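The sketch referenced above illustrates one of the model combinations evaluated in this study (L1-regularized logistic regression for feature selection followed by a random forest classifier, under leave-one-out sampling). The synthetic matrix stands in for the 61 radiomic delta-features, and all hyperparameters are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import roc_auc_score

# Synthetic stand-ins for 12 patients x 61 delta-features and a binary
# overall-survival label; the real study uses radiomic features from MR images.
rng = np.random.default_rng(3)
X = rng.normal(size=(12, 61))
y = (X[:, 0] > np.median(X[:, 0])).astype(int)   # balanced placeholder labels

model = make_pipeline(
    SelectFromModel(LogisticRegression(penalty="l1", solver="liblinear", C=1.0)),
    RandomForestClassifier(n_estimators=200, random_state=0),
)
# Leave-one-out: each patient is predicted by a model trained on the other 11.
proba = cross_val_predict(model, X, y, cv=LeaveOneOut(), method="predict_proba")[:, 1]
print(f"AUC: {roc_auc_score(y, proba):.3f}")
```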
Item Open Access Analysis of clinical predictors of kidney diseases in type 2 diabetes patients based on machine learning. (International urology and nephrology, 2022-09) Hui, Dongna; Sun, Yiyang; Xu, Shixin; Liu, Junjie; He, Ping; Deng, Yuhui; Huang, Huaxiong; Zhou, Xiaoshuang; Li, Rongshan
Background: The heterogeneity of Type 2 Diabetes Mellitus (T2DM) complicated with renal diseases has not been fully understood in clinical practice. The purpose of the study was to propose potential predictive factors to identify diabetic kidney disease (DKD), nondiabetic kidney disease (NDKD), and DKD superimposed on NDKD (DKD + NDKD) in T2DM patients noninvasively and accurately.
Methods: Two hundred forty-one eligible patients confirmed by renal biopsy were enrolled in this retrospective, analytical study. The features, composed of clinical and biochemical data prior to renal biopsy, were extracted from patients' electronic medical records. Machine learning algorithms were used to distinguish among the different kidney diseases pairwise. Feature variables selected in the developed models were evaluated.
Results: A logistic regression model achieved an accuracy of 0.8306 ± 0.0057 for DKD versus NDKD classification. Hematocrit, diabetic retinopathy (DR), hematuria, platelet distribution width, and history of hypertension were identified as important risk factors. An SVM model then allowed us to differentiate NDKD from DKD + NDKD with an accuracy of 0.8686 ± 0.052, where hematuria, diabetes duration, international normalized ratio (INR), D-dimer, and high-density lipoprotein cholesterol were the top risk factors. Finally, the logistic regression model indicated that D-dimer, hematuria, INR, systolic pressure, and DR were likely to be predictive factors for distinguishing DKD from DKD + NDKD.
Conclusion: Predictive factors were successfully identified among different renal diseases in type 2 diabetes patients via machine learning methods. More attention should be paid to coagulation factors in DKD + NDKD patients, which might indicate a hypercoagulable state and an increased risk of thrombosis.
Item Open Access Appearance-based Gaze Estimation and Applications in Healthcare (2020) Chang, Zhuoqing
Gaze estimation, the ability to predict where a person is looking, has become an indispensable technology in healthcare research. Current tools for gaze estimation rely on specialized hardware and are typically used in well-controlled laboratory settings. Novel appearance-based methods directly estimate a person's gaze from the appearance of their eyes, making gaze estimation possible with ubiquitous, low-cost devices, such as webcams and smartphones. This dissertation presents new methods for appearance-based gaze estimation, as well as applications of this technology to challenging problems in practical healthcare settings.
One limitation of appearance-based methods is the need to collect a large amount of training data to learn the highly variant eye appearance space. To address this fundamental issue, we develop a method to synthesize novel images of the eye using data from a low-cost RGB-D camera and show that this data augmentation technique can improve gaze estimation accuracy significantly. In addition, we explore the potential of utilizing visual saliency information as a means to transparently collect weakly-labelled gaze data at scale. We show that the collected data can be used to personalize a generic gaze estimation model to achieve better performance on an individual.
In healthcare applications, the possibility of replacing specialized hardware with ubiquitous devices when performing eye-gaze analysis is a major asset that appearance-based methods bring to the table. In the first application, we assess the risk of autism in toddlers by analyzing videos of them watching a set of expert-curated stimuli on a mobile device. We show that appearance-based methods can be used to estimate their gaze position on the device screen and that differences between the autistic and typically-developing populations are significant. In the second application, we attempt to detect oculomotor abnormalities in people with cerebellar ataxia using video recorded from a mobile phone. By tracking the iris movement of participants while they watch a short video stimulus, we show that we are able to achieve high sensitivity and specificity in differentiating people with smooth pursuit oculomotor abnormalities from those without.
Item Open Access Application of Stochastic Processes in Nonparametric Bayes (2014) Wang, Yingjian
This thesis presents theoretical studies of some stochastic processes and their applications in the Bayesian nonparametric methods. The stochastic processes discussed in the thesis are mainly the ones with independent increments - the Levy processes. We develop new representations for the Levy measures of two representative examples of the Levy processes, the beta and gamma processes. These representations are manifested in terms of an infinite sum of well-behaved (proper) beta and gamma distributions, with the truncation and posterior analyses provided. The decompositions provide new insights into the beta and gamma processes (and their generalizations), and we demonstrate how the proposed representation unifies some properties of the two, as these are of increasing importance in machine learning.
Next a new Levy process is proposed for an uncountable collection of covariate-dependent feature-learning measures; the process is called the kernel beta process. Available covariates are handled efficiently via the kernel construction, with covariates assumed observed with each data sample ("customer"), and latent covariates learned for each feature ("dish"). The dependencies among the data are represented with the covariate-parameterized kernel function. The beta process is recovered as a limiting case of the kernel beta process. An efficient Gibbs sampler is developed for computations, and state-of-the-art results are presented for image processing and music analysis tasks.
Last is a non-Levy process example of the multiplicative gamma process applied in the low-rank representation of tensors. The multiplicative gamma process is applied along the super-diagonal of tensors in the rank decomposition, and its shrinkage property nonparametrically learns the rank from the multiway data. This model is constructed as conjugate for the continuous multiway data case. For the non-conjugate binary multiway data, the Polya-Gamma auxiliary variable is sampled to elicit closed-form Gibbs sampling updates. This rank decomposition of tensors driven by the multiplicative gamma process yields state-of-the-art performance on various synthetic and benchmark real-world datasets, with desirable model scalability.
Item Open Access Applications of Deep Learning, Machine Learning, and Remote Sensing to Improving Air Quality and Solar Energy Production (2021) Zheng, Tongshu
Exposure to higher PM2.5 can lead to increased risks of mortality; however, the spatial concentrations of PM2.5 are not well characterized, even in megacities, due to the sparseness of regulatory air quality monitoring (AQM) stations. This motivates novel low-cost methods to estimate ground-level PM2.5 at a fine spatial resolution so that PM2.5 exposure in epidemiological research can be better quantified and local PM2.5 hotspots at a community level can be automatically identified. Wireless low-cost particulate matter sensor networks (WLPMSNs) are among these novel low-cost methods, transforming air quality monitoring by providing PM information at finer spatial and temporal resolutions; however, large-scale WLPMSN calibration and maintenance remain a challenge: the manual labor involved in initial calibration by collocation and routine recalibration is intensive, the transferability of the calibration models determined from initial collocation to new deployment sites is questionable because calibration factors typically vary with urban heterogeneity of operating conditions and aerosol optical properties, and low-cost sensors can drift or degrade over time. This work presents a simultaneous Gaussian Process regression (GPR) and simple linear regression pipeline to calibrate and monitor dense WLPMSNs on the fly by leveraging all available reference monitors across an area without resorting to pre-deployment collocation calibration. We evaluated our method for Delhi, where the PM2.5 measurements of all 22 regulatory reference and 10 low-cost nodes were available for 59 days from January 1, 2018 to March 31, 2018 (PM2.5 averaged 138 ± 31 μg m-3 among 22 reference stations), using a leave-one-out cross-validation (CV) over the 22 reference nodes. We showed that our approach can achieve an overall 30 % prediction error (RMSE: 33 μg m-3) at a 24 h scale and is robust, as underscored by the small variability in the GPR model parameters and in the model-produced calibration factors for the low-cost nodes among the 22-fold CV. Of the 22 reference stations, high-quality predictions were observed for those stations whose PM2.5 means were close to the Delhi-wide mean (i.e., 138 ± 31 μg m-3) and relatively poor predictions for those nodes whose means differed substantially from the Delhi-wide mean (particularly on the lower end). We also observed washed-out local variability in PM2.5 across the 10 low-cost sites after calibration using our approach, which stands in marked contrast to the true wide variability across the reference sites. These observations revealed that our proposed technique (and more generally the geostatistical technique) requires high spatial homogeneity in the pollutant concentrations to be fully effective. We further demonstrated that our algorithm performance is insensitive to training window size, as the mean prediction error rate and the standard error of the mean (SEM) for the 22 reference stations remained consistent at ~30 % and ~3–4 % when increments of 2 days' data were included in the model training.
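To make the flavor of that pipeline concrete, here is a minimal sketch (an assumption-laden simplification, not the paper's implementation) in which a spatial Gaussian Process regression is fitted to the reference monitors each day and a simple linear regression then maps one low-cost node's raw readings to the GPR field at its location, yielding a calibration slope and intercept. Coordinates, kernel, and data are synthetic placeholders.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
n_ref, n_days = 22, 14
ref_xy = rng.uniform(0, 10, size=(n_ref, 2))           # reference station coords
node_xy = np.array([[4.2, 5.1]])                       # one low-cost node's coords

# Placeholder daily 24 h PM2.5 at the reference stations (ug/m3).
ref_pm = 138 + 31 * rng.standard_normal((n_days, n_ref))
# Placeholder raw low-cost readings: a biased, rescaled, noisy version of the
# city-wide daily signal.
daily_mean = ref_pm.mean(axis=1)
node_raw = 0.6 * daily_mean + 20 + 5 * rng.standard_normal(n_days)

field_at_node = np.empty(n_days)
for d in range(n_days):                                # one spatial GPR per day
    gpr = GaussianProcessRegressor(kernel=RBF(3.0) + WhiteKernel(1.0),
                                   normalize_y=True)
    gpr.fit(ref_xy, ref_pm[d])
    field_at_node[d] = gpr.predict(node_xy)[0]

# Simple linear regression maps raw node readings to the GPR field at its site,
# yielding the node's calibration slope and intercept.
cal = LinearRegression().fit(node_raw.reshape(-1, 1), field_at_node)
print(f"slope = {cal.coef_[0]:.2f}, intercept = {cal.intercept_:.1f}")
```

In this framing, a slope near zero or an intercept pinned at the city-wide mean would be the kind of aberrant calibration factor used to flag a malfunctioning node.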
The markedly low requirement of our algorithm for training data enables the models to remain nearly up to date in the field at all times, thus realizing the algorithm's full potential for dynamically surveilling large-scale WLPMSNs by detecting malfunctioning low-cost nodes and tracking the drift with little latency. Our algorithm presented similarly stable 26–34 % mean prediction errors and ~3–7 % SEMs over the sampling period when pre-trained on the current week's data and predicting 1 week ahead, and is therefore suitable for online calibration. Simulations conducted using our algorithm suggest that in addition to dynamic calibration, the algorithm can also be adapted for automated monitoring of large-scale WLPMSNs. In these simulations, the algorithm was able to differentiate malfunctioning low-cost nodes (due to either hardware failure or heavy influence of local sources) within a network by identifying aberrant model-generated calibration factors (i.e., slopes close to zero and intercepts close to the Delhi-wide mean of true PM2.5). The algorithm was also able to track the drift of low-cost nodes accurately within 4 % error for all the simulation scenarios. The simulation results showed that ~20 reference stations are optimum for our solution in Delhi and confirmed that low-cost nodes can extend the spatial precision of a network by decreasing the extent of pure interpolation among only reference stations. Our solution has substantial implications for reducing the amount of manual labor for the calibration and surveillance of extensive WLPMSNs, improving the spatial comprehensiveness of PM evaluation, and enhancing the accuracy of WLPMSNs.
We trained our CNN-RF model on 10,400 available daily images of the AQM stations labeled with the corresponding ground-truth PM2.5 and evaluated the model performance on 2622 holdout images. Our model estimates ground-level PM2.5 accurately at a 200 m spatial resolution with a mean absolute error (MAE) as low as 10.1 μg m-3 (equivalent to 23.7% error) and Pearson and Spearman r scores up to 0.91 and 0.90, respectively. Our trained CNN from Beijing is then applied to Shanghai, a similar urban area. By quickly retraining only the RF but not the CNN on the new Shanghai imagery dataset, our model accurately estimates PM2.5 at Shanghai's 10 AQM stations, with an MAE of 7.7 μg m-3 (18.6% error) and both Pearson and Spearman r scores of 0.85. The finest 200 m spatial resolution of ground-level PM2.5 estimates from our model in this study is higher than the vast majority of existing state-of-the-art satellite-based PM2.5 retrieval methods, and our 200 m model's estimation performance is also at the high end of these state-of-the-art methods. Our results highlight the potential of augmenting existing spatial predictors of PM2.5 with high-resolution satellite imagery to enhance the spatial resolution of PM2.5 estimates for a wide range of applications, including pollutant emission hotspot determination, PM2.5 exposure assessment, and fusion of satellite remote sensing and low-cost air quality sensor network information. We later, however, found out that this CNN-RF sequential model, despite effectively capturing spatial variations, yields higher average PM2.5 prediction errors than its RF part alone using only meteorological conditions, most likely because the CNN-RF sequential model is unable to fully use the information in satellite images in the presence of meteorological conditions. To break this bottleneck in PM2.5 prediction performance, we reformulated the previous CNN-RF sequential model into an RF-CNN joint model that adopts a residual learning ideology that forces the CNN part to most effectively exploit the information in satellite images that is only “orthogonal” to meteorology. The RF-CNN joint model achieved low normalized root mean square error for PM2.5 of within ~31% and normalized mean absolute error of within ~19% on the holdout samples in both Delhi and Beijing, better than the performances of both the CNN-RF sequential model and the RF part alone using only meteorological conditions. To date, few studies have used their simulated ambient PM2.5 to detect hotspots. Furthermore, even the hotspots studied in these very limited works are all “global” hotspots that have the absolute highest PM2.5 levels in the whole study region. Little is known about “local” hotspots that have the highest PM2.5 only relative to their neighbors at fine-scale community levels, even though the disparities in outdoor PM2.5 exposures and their associated risks of mortality between populations in local hotspots and coolspots within the same communities can be rather large. These limitations motivated us to concatenate a local contrast normalization (LCN) algorithm at the end of the RF-CNN joint model to automatically reveal local PM2.5 hotspots from the estimated PM2.5 maps. The RF-CNN-LCN pipeline reasonably predicts urban PM2.5 local hotspots and coolspots by capturing both the main intra-urban spatial trends in PM2.5 and the local variations in PM2.5 with urban landscape, with local hotspots corresponding to compact urban spatial structures and coolspots to open areas and green spaces.
Based on 20 sampled representative neighborhoods in Delhi, our pipeline revealed that on average a significant 9.2 ± 4.0 μg m-3 long-term PM2.5 exposure difference existed between the local hotspots and coolspots within the same community, with the Indira Gandhi International Airport area having the steepest increase of 20.3 μg m-3 from the coolest spot (the residential area immediately outside the airport) to the hottest spot (airport runway). This work provides a possible means of automatically identifying local PM2.5 hotspots at 300 m in heavily polluted megacities. It highlights the potential existence of substantial health inequalities in long-term outdoor PM2.5 exposures within even the same local neighborhoods between local hotspots and coolspots. Apart from posing serious health risks, deposition of dust and anthropogenic particulate matter (PM) on solar photovoltaics (PVs), known as soiling, can diminish solar energy production appreciably. As of 2018, the global cumulative PV capacity crossed 500 GW, of which at least 3–4% was estimated to be lost due to soiling, equivalent to ~4–6 billion USD in revenue losses. In the context of a projected ~16-fold increase of global solar capacity to 8.5 TW by 2050, soiling will play an increasingly important part in estimating and forecasting the performance and economics of solar PV installations. However, reliable soiling information is currently lacking because the existing soiling monitoring systems are expensive. This work presents a low-cost remote sensing algorithm that estimates utility-scale solar farms' daily solar energy loss due to PV soiling by directly processing the daily (near real-time updated), 3 m/pixel resolution, and global coverage micro-satellite surface reflectance (SR) analytic product from the commercial satellite company Planet. We demonstrate that our approach can estimate the daily soiling loss for a solar farm in Pune, India over three years, during which soiling on average caused a ~5.4% reduction in solar energy production. We further estimated that around 437 MWh of solar energy, equivalent to ~11799 USD, was lost in total over the 3 years at this solar farm. Our approach's average soiling estimation matches perfectly with the ~5.3% soiling loss reported by a previously published model for this solar farm site. Compared to other state-of-the-art PV soiling modeling approaches, the proposed unsupervised approach has the benefit of estimating PV soiling at precisely the solar farm level (in contrast to coarse regional modeling over only the large spatial grids in which a solar farm resides) and at an unprecedentedly high temporal resolution (i.e., 1 day) without resorting to solar farms' proprietary solar energy generation data or knowledge about the specific components of deposited PM or these species' dry deposition flux and other physical properties. Our approach allows solar farm owners to keep close track of the intensity of soiling at their sites and perform panel cleaning operations more strategically rather than based on a fixed schedule.
Item Open Access Applications of Topological Data Analysis and Sliding Window Embeddings for Learning on Novel Features of Time-Varying Dynamical Systems(2017) Ghadyali, Hamza MustafaThis work introduces geometric and topological data analysis (TDA) tools that can be used in conjunction with sliding window transformations, also known as delay embeddings, for discovering structure in time series and dynamical systems in an unsupervised or supervised learning framework. For signals of unknown period, we introduce an intuitive topological method to discover the period, and we demonstrate its use on synthetic examples and real temperature data. Alternatively, for almost-periodic signals of known period, we introduce a metric called Geometric Complexity of an Almost Periodic signal (GCAP), based on a topological construction, which allows us to continuously measure the evolving variation of a signal's periods. We apply this method to temperature data collected from over 200 weather stations in the United States and describe the novel patterns that we observe. Next, we show how geometric and TDA tools can be used in a supervised learning framework. Seizure detection using electroencephalogram (EEG) data is formulated as a binary classification problem. We define new collections of geometric and topological features of multi-channel data that utilize the temporal and spatial context of EEG, and show how they yield better overall seizure-detection performance than the usual time-domain and frequency-domain features. Finally, we introduce a novel method to sonify persistence diagrams, and more generally any planar point cloud, using a modified version of the harmonic table. This auditory display can be useful for finding patterns that visual analysis alone may miss.
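As a generic illustration of the sliding window (delay) embedding construction that underlies this work, the sketch below embeds a 1-D signal as a point cloud and scores its periodicity by the lifetime of the most persistent 1-dimensional homology class, computed here with the open-source ripser package: a periodic signal traces a loop in the embedding and therefore produces a long-lived H1 feature, while noise does not. The window dimension and delay are arbitrary choices, and this is the textbook construction rather than the dissertation's specific period-discovery or GCAP algorithms.

```python
# Sliding window embedding + persistent homology as a periodicity score.
import numpy as np
from ripser import ripser

def sliding_window_embedding(signal, dim, tau):
    """Map x(t) to the point cloud [x(t), x(t+tau), ..., x(t+(dim-1)*tau)]."""
    n = len(signal) - (dim - 1) * tau
    return np.stack([signal[i : i + (dim - 1) * tau + 1 : tau] for i in range(n)])

def max_h1_persistence(signal, dim=20, tau=5):
    """Lifetime of the most persistent H1 class of the embedded point cloud;
    large values indicate a loop, i.e., (near-)periodic behavior."""
    cloud = sliding_window_embedding(np.asarray(signal, dtype=float), dim, tau)
    h1 = ripser(cloud, maxdim=1)['dgms'][1]
    if len(h1) == 0:
        return 0.0
    return float(np.max(h1[:, 1] - h1[:, 0]))

# Example: a noisy sine should score much higher than white noise.
t = np.linspace(0, 20 * np.pi, 600)
print(max_h1_persistence(np.sin(t) + 0.1 * np.random.randn(t.size)))
print(max_h1_persistence(np.random.randn(t.size)))
```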
Item Open Access Applying Machine Learning to Testing and Diagnosis of Integrated Systems(2021) Pan, RenjianThe growing complexity of integrated boards and systems makes manufacturing test and diagnosis increasingly expensive. There is a pressing need to reduce test cost and to pinpoint the root causes of failures in integrated systems more effectively. Leveraging machine learning, a number of intelligent test-cost-reduction and root-cause-analysis methods have been proposed. However, it remains extremely challenging to (i) reduce the cost of black-box testing for integrated systems, and (ii) pinpoint root causes with little reliance on labeled test data from repair history. To tackle these challenges, this dissertation proposes multiple machine-learning-based solutions for black-box test-cost reduction and unsupervised/semi-supervised root-cause analysis.
For black-box test-cost reduction, we propose a novel test selection method based on a Bayesian network model. First, test selection is formulated as a constrained optimization problem. Next, a score-based algorithm is implemented to construct the Bayesian network for black-box tests. Finally, we propose a Bayesian index that exploits the Markov blanket property, and an iterative test selection method is developed based on it.
For root-cause analysis, we first propose an unsupervised method that requires no repair history. In the first stage, a decision-tree model is trained with system test information to cluster the data in a coarse-grained manner. In the second stage, frequent-pattern mining is applied to extract frequent patterns in each decision-tree node to precisely cluster the data, so that each cluster represents only a small number of root causes. The proposed method can accommodate both numerical and categorical test items. A combination of the L-method, cross-validation, and Silhouette scores enables us to automatically determine all hyper-parameters. Two industry case studies with system test data demonstrate that the proposed approach significantly outperforms the state-of-the-art unsupervised root-cause-analysis method. Utilizing transfer learning, we further improve the performance of unsupervised root-cause analysis. A two-stage clustering method is first developed by exploiting model selection based on the Silhouette score. Next, a data-selection method based on ensemble learning is proposed to transfer valuable information from a source product to improve the diagnosis accuracy on a target product with insufficient data. Two case studies based on industry designs demonstrate that the proposed approach significantly outperforms other state-of-the-art unsupervised root-cause-analysis methods. In addition, we propose a semi-supervised root-cause-analysis method with co-training, where only a small set of labeled data is required. Using random forest as the learning kernel, a co-training technique is proposed to leverage the unlabeled data by automatically pre-labeling a subset of the unlabeled samples and retraining each decision tree. Several novel techniques are also proposed to avoid overfitting and determine hyper-parameters. Two case studies based on industrial designs demonstrate that the proposed approach significantly outperforms the state-of-the-art methods.
In summary, this dissertation addresses several of the most challenging problems in the testing and diagnosis of integrated systems using machine learning.
A test selection method based on Bayesian networks reduces the cost of black-box testing. With unsupervised learning, semi-supervised learning, and transfer learning, we analyze root causes for integrated systems with little reliance on historical diagnosis information. The proposed approaches are expected to benefit the semiconductor industry by effectively reducing black-box test cost and efficiently diagnosing integrated systems.
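As a rough sketch of the co-training idea described above (and only that: the dissertation retrains individual decision trees inside a random forest, whereas the generic loop below simply uses two random forests on two feature views), confident pseudo-labels from each model are promoted to the labeled pool and both models are retrained. The feature split, confidence threshold, and number of rounds are illustrative assumptions.

```python
# Generic two-view co-training with random forests as base learners (sketch).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def co_train(X_lab, y_lab, X_unlab, view_a, view_b, rounds=5, conf=0.9):
    """view_a / view_b: column indices defining the two feature views."""
    X_lab, y_lab, X_unlab = map(np.asarray, (X_lab, y_lab, X_unlab))
    rf_a = RandomForestClassifier(n_estimators=200, random_state=0)
    rf_b = RandomForestClassifier(n_estimators=200, random_state=1)
    for _ in range(rounds):
        rf_a.fit(X_lab[:, view_a], y_lab)
        rf_b.fit(X_lab[:, view_b], y_lab)
        if X_unlab.shape[0] == 0:
            break
        # Each model pre-labels the unlabeled samples it is most confident about;
        # those samples join the labeled pool for the next round (ties between the
        # two views are broken in favor of the second view).
        keep = np.zeros(X_unlab.shape[0], dtype=bool)
        labels = np.empty(X_unlab.shape[0], dtype=y_lab.dtype)
        for rf, view in ((rf_a, view_a), (rf_b, view_b)):
            proba = rf.predict_proba(X_unlab[:, view])
            confident = proba.max(axis=1) >= conf
            labels[confident] = rf.classes_[proba.argmax(axis=1)][confident]
            keep |= confident
        if not keep.any():
            break
        X_lab = np.vstack([X_lab, X_unlab[keep]])
        y_lab = np.concatenate([y_lab, labels[keep]])
        X_unlab = X_unlab[~keep]
    return rf_a, rf_b
```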