Browsing by Author "Sapiro, Guillermo"
Item Open Access A Six-Minute Measure of Vocalizations in Toddlers with Autism Spectrum Disorder (Autism Research) Tenenbaum, Elena J; Carpenter, Kimberly LH; Sabatos-DeVito, Maura; Hashemi, Jordan; Vermeer, Saritha; Sapiro, Guillermo; Dawson, Geraldine
Item Open Access A tablet-based game for the assessment of visual motor skills in autistic children. (NPJ digital medicine, 2023-02) Perochon, Sam; Matias Di Martino, J; Carpenter, Kimberly LH; Compton, Scott; Davis, Naomi; Espinosa, Steven; Franz, Lauren; Rieder, Amber D; Sullivan, Connor; Sapiro, Guillermo; Dawson, Geraldine
Increasing evidence suggests that early motor impairments are a common feature of autism. Thus, scalable, quantitative methods for measuring motor behavior in young autistic children are needed. This work presents an engaging and scalable assessment of visual-motor abilities based on a bubble-popping game administered on a tablet. Participants are 233 children ranging from 1.5 to 10 years of age (147 neurotypical children and 86 children diagnosed with autism spectrum disorder [autistic], of which 32 are also diagnosed with co-occurring attention-deficit/hyperactivity disorder [autistic+ADHD]). Computer vision analyses are used to extract several game-based touch features, which are compared across autistic, autistic+ADHD, and neurotypical participants. Results show that younger (1.5-3 years) autistic children pop the bubbles at a lower rate, and their ability to touch the bubble's center is less accurate compared to neurotypical children. When they pop a bubble, their finger lingers for a longer period, and they show more variability in their performance. In older children (3-10 years), consistent with previous research, the presence of co-occurring ADHD is associated with greater motor impairment, reflected in lower accuracy and more variable performance.
Several motor features are correlated with standardized assessments of fine motor and cognitive abilities, as evaluated by an independent clinical assessment. These results highlight the potential of touch-based games as an efficient and scalable approach for assessing children's visual-motor skills, which can be part of a broader screening tool for identifying early signs associated with autism.
Item Open Access Appearance-based Gaze Estimation and Applications in Healthcare (2020) Chang, Zhuoqing
Gaze estimation, the ability to predict where a person is looking, has become an indispensable technology in healthcare research. Current tools for gaze estimation rely on specialized hardware and are typically used in well-controlled laboratory settings. Novel appearance-based methods directly estimate a person's gaze from the appearance of their eyes, making gaze estimation possible with ubiquitous, low-cost devices, such as webcams and smartphones. This dissertation presents new methods for appearance-based gaze estimation and applies this technology to challenging problems in practical healthcare applications.
One limitation of appearance-based methods is the need to collect a large amount of training data to learn the highly variant eye appearance space. To address this fundamental issue, we develop a method to synthesize novel images of the eye using data from a low-cost RGB-D camera and show that this data augmentation technique can improve gaze estimation accuracy significantly. In addition, we explore the potential of utilizing visual saliency information as a means to transparently collect weakly-labelled gaze data at scale. We show that the collected data can be used to personalize a generic gaze estimation model to achieve better performance on an individual.
In healthcare applications, the possibility of replacing specialized hardware with ubiquitous devices when performing eye-gaze analysis is a major asset that appearance-based methods bring to the table. In the first application, we assess the risk of autism in toddlers by analyzing videos of them watching a set of expert-curated stimuli on a mobile device. We show that appearance-based methods can be used to estimate their gaze position on the device screen and that differences between the autistic and typically-developing populations are significant. In the second application, we attempt to detect oculomotor abnormalities in people with cerebellar ataxia using video recorded from a mobile phone. By tracking the iris movement of participants while they watch a short video stimulus, we show that we are able to achieve high sensitivity and specificity in differentiating people with smooth pursuit oculomotor abnormalities from those without.
Item Open Access Automatic Behavioral Analysis from Faces and Applications to Risk Marker Quantification for Autism (2018) Hashemi, Jordan
This dissertation presents novel methods for behavioral analysis with a focus on early risk marker identification for autism. We present current contributions including a method for pose-invariant facial expression recognition, a self-contained mobile application for behavioral analysis, and a framework to calibrate a trained deep model with data synthesis and augmentation. First we focus on pose-invariant facial expression recognition. It is known that 3D features have higher discrimination power than 2D features; however, usually 3D features are not readily available at testing time. For pose-invariant facial expression recognition, we utilize multi-modal features at training and exploit the cross-modal relationship at testing. We extend our pose-invariant facial expression recognition method and present other methods to characterize a multitude of risk behaviors related to risk marker identification for autism. In practice, identification of children with neurodevelopmental disorders requires low specificity screening with questionnaires followed by time-consuming, in-person observational analysis by highly-trained clinicians. To alleviate the time- and resource-intensive risk identification process, we develop a self-contained, closed-loop, mobile application that records a child’s face while he/she is watching specific, expertly-curated movie stimuli and automatically analyzes the behavioral responses of the child. We validate our methods against those of expert human raters. Using the developed methods, we present findings on group differences for behavioral risk markers for autism and interactions between motivational framing context, facial affect, and memory outcome. Lastly, we present a framework to use face synthesis to calibrate trained deep models to deployment scenarios that they have not been trained on.
Face synthesis involves creating novel realizations of an image of a face and is an effective method that is predominantly employed only at training and in a blind manner (e.g., blindly synthesize as much as possible). We present a framework that optimally selects synthesis variations and employs them both during training and at testing, leading to more efficient training and better performance.
Item Open Access Automatic emotion and attention analysis of young children at home: a ResearchKit autism feasibility study (npj Digital Medicine, 2018-12) Egger, Helen L; Dawson, Geraldine; Hashemi, Jordan; Carpenter, Kimberly LH; Espinosa, Steven; Campbell, Kathleen; Brotkin, Samuel; Schaich-Borg, Jana; Qiu, Qiang; Tepper, Mariano; Baker, Jeffrey P; Bloomfield, Richard A; Sapiro, Guillermo
Item Open Access Coded aperture compressive temporal imaging. (Opt Express, 2013-05-06) Llull, Patrick; Liao, Xuejun; Yuan, Xin; Yang, Jianbo; Kittle, David; Carin, Lawrence; Sapiro, Guillermo; Brady, David J
We use mechanical translation of a coded aperture for code division multiple access compression of video. We discuss the compressed video's temporal resolution and present experimental results for reconstructions of > 10 frames of temporal data per coded snapshot.
Item Open Access Computational Analysis of Clinical Brain Sub-cortical Structures from Ultrahigh-Field MRI (2015) Kim, Jinyoung
Volumetric segmentation of brain sub-cortical structures within the basal ganglia and thalamus from magnetic resonance imaging (MRI) is necessary for non-invasive diagnosis and neurosurgery planning. This is a challenging problem due in part to limited boundary information between structures, similar intensity profiles across the different structures, and low contrast data. With recent advances in ultrahigh-field MR technology, direct identification and clear visualization of such brain sub-cortical structures are facilitated. This dissertation first presents a semi-automatic segmentation system exploiting the visual benefits of ultrahigh-field MRI. The proposed approach utilizes the complementary edge information in the multiple structural MRI modalities. It combines two optimally selected modalities from susceptibility-weighted, T2-weighted, and diffusion MRI, and introduces a tailored new edge indicator function.
In addition to this, prior shape and configuration knowledge of the sub-cortical structures are employed in order to guide the evolution of geometric active surfaces. Neighboring structures are segmented iteratively, constraining over-segmentation at their borders with a non-overlapping penalty. Experiments with data acquired on a 7 Tesla (T) MRI scanner demonstrate the feasibility and power of the approach for the segmentation of basal ganglia components critical for neurosurgery applications such as Deep Brain Stimulation (DBS) surgery.
DBS surgery on brain sub-cortical regions within the basal ganglia and thalamus is an effective treatment to alleviate symptoms of neuro-degenerative diseases. In particular, DBS of the subthalamic nucleus (STN) has shown important clinical efficacy for Parkinson’s disease (PD). While accurate localization of the STN and its substructures is critical for precise DBS electrode placement, direct visualization of the STN in current standard clinical MR imaging (e.g., 1.5-3T) is still elusive. Therefore, to locate the target, DBS surgeons today often rely on consensus coordinates, lengthy and risky micro-electrode recording (MER), and the patient’s behavioral feedback. Recently, ultrahigh-field MR imaging has allowed direct visualization of brain sub-cortical structures. However, such high fields are not clinically available in practice. This dissertation also introduces a non-invasive automatic localization method for the STN, one of the critical targets for DBS surgery, in a standard clinical scenario (1.5T MRI). The spatial dependency between the STN and potential predictor structures is first learned from 7T MR training data using regression models combined by bagging. Then, given such predictors automatically detected on the clinical patient data, the complete region of the STN is predicted as a probability map using the high quality information learned from 7T. Furthermore, a robust framework is proposed to properly weight different training subsets, estimating their influence on the prediction accuracy. The STN prediction on the clinical 1.5T MR datasets from 15 PD patients is performed within the proposed approach. Experimental results demonstrate that the developed framework enables accurate prediction of the STN, closely matching the 7T ground truth.
Item Open Access Computer vision tools for low-cost and noninvasive measurement of autism-related behaviors in infants. (Autism Res Treat, 2014) Hashemi, Jordan; Tepper, Mariano; Vallin Spina, Thiago; Esler, Amy; Morellas, Vassilios; Papanikolopoulos, Nikolaos; Egger, Helen; Dawson, Geraldine; Sapiro, Guillermo
The early detection of developmental disorders is key to child outcome, allowing interventions to be initiated which promote development and improve prognosis. Research on autism spectrum disorder (ASD) suggests that behavioral signs can be observed late in the first year of life. Many of these studies involve extensive frame-by-frame video observation and analysis of a child's natural behavior. Although nonintrusive, these methods are extremely time-intensive and require a high level of observer training; thus, they are burdensome for clinical and large population research purposes. This work is a first milestone in a long-term project on non-invasive early observation of children in order to aid in risk detection and research of neurodevelopmental disorders. We focus on providing low-cost computer vision tools to measure and identify ASD behavioral signs based on components of the Autism Observation Scale for Infants (AOSI). In particular, we develop algorithms to measure responses to general ASD risk assessment tasks and activities outlined by the AOSI which assess visual attention by tracking facial features.
We show results, including comparisons with expert and nonexpert clinicians, which demonstrate that the proposed computer vision tools can capture critical behavioral observations and potentially augment the clinician's behavioral observations obtained from real in-clinic assessments.
Item Open Access Computer vision tools for the non-invasive assessment of autism-related behavioral markers Hashemi, Jordan; Spina, Thiago Vallin; Tepper, Mariano; Esler, Amy; Morellas, Vassilios; Papanikolopoulos, Nikolaos; Sapiro, Guillermo
The early detection of developmental disorders is key to child outcome, allowing interventions to be initiated that promote development and improve prognosis. Research on autism spectrum disorder (ASD) suggests behavioral markers can be observed late in the first year of life. Many of these studies involved extensive frame-by-frame video observation and analysis of a child's natural behavior. Although non-intrusive, these methods are extremely time-intensive and require a high level of observer training; thus, they are impractical for clinical and large population research purposes. Diagnostic measures for ASD are available for infants but are only accurate when used by specialists experienced in early diagnosis. This work is a first milestone in a long-term multidisciplinary project that aims at helping clinicians and general practitioners accomplish this early detection/measurement task automatically. We focus on providing computer vision tools to measure and identify ASD behavioral markers based on components of the Autism Observation Scale for Infants (AOSI). In particular, we develop algorithms to measure three critical AOSI activities that assess visual attention. We augment these AOSI activities with an additional test that analyzes asymmetrical patterns in unsupported gait.
The first set of algorithms involves assessing head motion by tracking facial features, while the gait analysis relies on joint foreground segmentation and 2D body pose estimation in video. We show results that provide insightful knowledge to augment the clinician's behavioral observations obtained from real in-clinic assessments.
Item Open Access Creating and parameterizing patient-specific deep brain stimulation pathway-activation models using the hyperdirect pathway as an example. (PloS one, 2017-01) Gunalan, Kabilar; Chaturvedi, Ashutosh; Howell, Bryan; Duchin, Yuval; Lempka, Scott F; Patriat, Remi; Sapiro, Guillermo; Harel, Noam; McIntyre, Cameron C
Background
Deep brain stimulation (DBS) is an established clinical therapy and computational models have played an important role in advancing the technology. Patient-specific DBS models are now common tools in both academic and industrial research, as well as clinical software systems. However, the exact methodology for creating patient-specific DBS models can vary substantially and important technical details are often missing from published reports.
Objective
Provide a detailed description of the assembly workflow and parameterization of a patient-specific DBS pathway-activation model (PAM) and predict the response of the hyperdirect pathway to clinical stimulation.
Methods
Integration of multiple software tools (e.g. COMSOL, MATLAB, FSL, NEURON, Python) enables the creation and visualization of a DBS PAM. An example DBS PAM was developed using 7T magnetic resonance imaging data from a single unilaterally implanted patient with Parkinson's disease (PD). This detailed description implements our best computational practices and most elaborate parameterization steps, as defined from over a decade of technical evolution.
Results
Pathway recruitment curves and strength-duration relationships highlight the non-linear response of axons to changes in the DBS parameter settings.
Conclusion
Parameterization of patient-specific DBS models can be highly detailed and constrained, thereby providing confidence in the simulation predictions, but at the expense of time-demanding technical implementation steps. DBS PAMs represent new tools for investigating possible correlations between brain pathway activation patterns and clinical symptom modulation.
Item Open Access Geometric Multimedia Time Series (2017) Tralie, Christopher John
This thesis provides a new take on problems in multimedia time series analysis by using a shape-based perspective to quantify patterns in time, which is complementary to more traditional analysis-based time series techniques. Inspired by the dynamical systems community, we turn time series into shapes via sliding window embeddings, which we refer to as "time-ordered point clouds" (TOPCs). This framework has traditionally been used on a single 1D observation function for deterministic systems, but we generalize the sliding window technique so that it applies not only to multivariate data (e.g. videos), but also to data which is not stationary (e.g. music).
The geometry of our time-ordered point clouds can be quite informative. For periodic signals, the point clouds fill out topological loops, which, depending on harmonic content, reside on various high dimensional tori. For quasiperiodic signals, the point clouds are dense on a torus. We use modern tools from topological data analysis (TDA) to quantify degrees of periodicity and quasiperiodicity by looking at these shapes, and we show that this can be used to detect anomalies in videos of vibrating vocal folds. In the case of videos, this has the advantage of substantially reducing the amount of preprocessing, as no motion tracking is needed, and the technique operates on raw pixels. This is also one of the first known uses of persistent H2 in a high dimensional setting.
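The sliding window idea can be illustrated with a minimal sketch for a single 1D signal. This is not the thesis's implementation: the function name `sliding_window_embedding` and the `window` and `step` parameters are assumptions chosen for illustration, and the multivariate and non-stationary generalizations described above are not covered.

```python
import numpy as np

def sliding_window_embedding(signal, window, step=1):
    """Stack overlapping windows of a 1D signal into a time-ordered point cloud.

    Row i of the result is the window that starts at sample i * step, so the
    rows keep their temporal order. (Hypothetical helper, for illustration only.)
    """
    signal = np.asarray(signal, dtype=float)
    n_points = (len(signal) - window) // step + 1
    if n_points <= 0:
        raise ValueError("window is longer than the signal")
    # Index matrix: one row of sample indices per window.
    idx = np.arange(window)[None, :] + step * np.arange(n_points)[:, None]
    return signal[idx]

# A pure cosine with a period of 40 samples: the point cloud returns to its
# starting point every 40 rows, i.e. it traces a closed loop.
t = np.arange(400)
cloud = sliding_window_embedding(np.cos(2 * np.pi * t / 40), window=20)
```

Here rows 0 and 40 of `cloud` coincide up to floating-point error, which is the closed-loop behavior the abstract describes for periodic signals.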
Periodic processes represent only a sliver of possible dynamics, and we also show that sequences of arbitrary normalized sliding window point clouds are approximately isometric between "cover songs," or different versions of the same song, possibly with radically different spectral content. Surprisingly, in this application, an incredibly simple geometric descriptor based on self-similarity matrices performs the best, and it also enables us to use MFCC features for this task, which was previously thought not to be possible due to significant timbral differences that can exist between versions. When combined with traditional pitch-based features using similarity metric fusion, we obtain state of the art results on automatic cover song identification.
In addition to being used as a geometric descriptor, self-similarity matrices provide a unifying description of phenomena in time-ordered point clouds throughout our work, and we use them to illustrate properties such as recurrence, mirror symmetry in time, and harmonics in periodic processes. They also provide the base representation for designing isometry blind time warping algorithms, which we use to synchronize time-ordered point clouds that are shifted versions of each other in space without ever having to do a spatial alignment. In particular, we devise an algorithm that lower bounds the 1-stress between two time-ordered point clouds, which is related to the Gromov-Hausdorff distance.
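Why self-similarity matrices suit isometry blind comparison can be seen in a small sketch: pairwise distances do not change when a point cloud is rotated, reflected, or translated. The code below only demonstrates that invariance under assumed names (`self_similarity_matrix`, random test data); it is not the time warping algorithm from the thesis.

```python
import numpy as np

def self_similarity_matrix(cloud):
    """Pairwise Euclidean distances between all points of a point cloud."""
    cloud = np.asarray(cloud, dtype=float)
    diff = cloud[:, None, :] - cloud[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(0)
cloud = rng.normal(size=(50, 3))

# Build an orthogonal matrix (rotation or reflection) from a QR decomposition,
# then apply it together with a translation to get an isometric copy.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = cloud @ q.T + np.array([5.0, -2.0, 1.0])

ssm_a = self_similarity_matrix(cloud)
ssm_b = self_similarity_matrix(moved)
```

The two matrices agree up to rounding, so any comparison built on them needs no spatial alignment step.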
Overall, we show a proof-of-concept and promise of the nascent field of geometric signal processing, which is worthy of further study in applications of music structure, multimodal data analysis, and video analysis.
Item Open Access Improved Visualization and Quantification for Hyperpolarized 129Xe MRI (2019) He, Mu
Pulmonary diseases, such as chronic obstructive pulmonary disease (COPD), fibrosis, and asthma, are responsible for a substantial health and financial burden worldwide. In 2016, COPD claimed more than 3 million lives, making it the 3rd leading cause of mortality. The treatment of pulmonary diseases continues to be hampered by the lack of reliable metrics to diagnose disease, as well as to assess disease progression and therapeutic response. The current tools to diagnose and monitor pulmonary diseases are the pulmonary function tests (PFT), consisting of spirometry and plethysmography, and diffusing capacity of the lungs for carbon monoxide (DLCO). However, these metrics are effort-dependent, tend to have poor reproducibility, and measure the lung as a whole, which allows subtle or regional disease to be ‘hidden’. Alternatively, computed tomography (CT) is capable of characterizing lung structures in exquisite detail and is commonly applied in detecting the presence of both emphysema and pulmonary fibrosis. However, these structural details do not necessarily correlate well with how patients feel, lung function, or treatment effect. Thus, this information is much better assessed by characterizing the functions of the lung. Nuclear medicine, employing 133Xe ventilation and 99Tcm-macroaggregated albumin perfusion scans (ventilation/perfusion V/Q scan), can assess the inequality of airflow and blood flow in the lung. However, this V/Q scan involves the use of radioactive tracers and is limited by both poor temporal and spatial resolution. Thus, there has been considerable interest in developing methods that can evaluate lung function comprehensively, non-invasively, and with 3D resolution.
In recent years, the introduction of hyperpolarized (HP) 129Xe magnetic resonance imaging (MRI) into clinical research has provided a robust and non-invasive 3D imaging technique, capable of high-resolution imaging of both pulmonary ventilation and gas exchange. Notably, gas exchange imaging is enabled by the solubility and unique frequency shifts of xenon in interstitial barrier tissues and capillary red blood cells (RBC). These features offer the potential for 129Xe MRI to be used to evaluate not only lung obstruction, but also interstitial and vascular diseases. With the capability for both ventilation and gas exchange imaging, robust and reproducible strategies are essential for both visualizing and quantifying the resulting images. Before that, a standardized acquisition with a well-understood relationship between 129Xe dose and image quality needs to be established for efficient and cost-effective acquisitions. Moreover, we also seek to understand the origins of ventilation defects as well as alterations in barrier uptake and RBC transfer. Until such fundamental issues are addressed, it will not be possible to disseminate 129Xe MRI for multi-center clinical trials.
The objective of this work is to establish a robust and comprehensive 129Xe ventilation MRI clinical workflow to investigate pulmonary disorders, and to lay the foundation for clinical deployment and multi-center dissemination. To this end, this work describes several milestones toward establishing a routine, high signal-to-noise ratio (SNR) 129Xe ventilation MRI acquisition with the minimum sufficient volume of 129Xe gas, and an associated robust quantification pipeline for our clinical platforms. Moreover, we compared our quantification pipeline to other approaches in the field, as well as on different types of acquisition strategies (multi-slice GRE vs. 3D-radial).
To date, various quantification methods have been established for 129Xe ventilation MRI, yet no agreement has been reached on how to calculate the ventilation defect percentage (VDP). Thus, this work begins by developing a quantification workflow with semi-automatic delineation of the 1H thoracic cavity images, automatic pulmonary vasculature extraction, and inhomogeneity correction of the 129Xe ventilation images. It employs a robust linear binning classification that characterizes the entire ventilation distribution while being grounded in a healthy reference population. This quantification method can help evaluate, with high repeatability, how aging, diseases, and treatment influence ventilation distribution.
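As a rough sketch of what a linear binning classification could look like, the code below assigns voxel intensities to equally wide bins whose thresholds are derived from a healthy reference distribution, and reports the share of voxels in the lowest bin as the VDP. Everything specific here is an assumption made for illustration and is not taken from the dissertation: the six bins, the one-standard-deviation bin width, the reference mean and standard deviation, and the sample intensities.

```python
import numpy as np

def linear_binning(intensities, ref_mean, ref_std, n_bins=6):
    """Label each voxel with a bin index from 0 (lowest) to n_bins - 1 (highest).

    Thresholds are spaced one reference standard deviation apart and centered
    on the reference mean (assumed scheme, for illustration only).
    """
    offsets = np.arange(n_bins - 1) - (n_bins - 2) / 2.0
    edges = ref_mean + offsets * ref_std
    return np.digitize(intensities, edges)

# Hypothetical rescaled voxel intensities and hypothetical reference statistics.
voxels = np.array([0.05, 0.2, 0.4, 0.6, 0.8, 0.95, 0.02, 0.55])
bins = linear_binning(voxels, ref_mean=0.5, ref_std=0.2)

# Ventilation defect percentage: share of voxels falling in the lowest bin.
vdp = 100.0 * np.mean(bins == 0)
```

With these made-up numbers the thresholds sit at 0.1, 0.3, 0.5, 0.7, and 0.9, two of the eight voxels fall in the lowest bin, and `vdp` is 25.0.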
To further evaluate the robustness of this linear binning quantification method, its performance in quantifying ventilation images was assessed against another commonly used clustering method, K-means. As part of the investigation, the methods were tested on images for which SNR had been artificially degraded. Through this evaluation, the minimum image SNR was established for an adequate quantification. We have also made the SNR-degraded image sets publicly available at Harvard Dataverse. These shared image sets could be used to evaluate the robustness of various quantification methods in the field. This endeavor is intended to help the pulmonary functional MRI community standardize its analysis methods and to lay the groundwork for future multi-center comparison studies.
We further address the fact that 129Xe ventilation MRI can be and has been conducted using a variety of pulse sequences, scan durations, and 129Xe doses. With more acceptance of the general utility of 129Xe MRI, imaging protocols must be standardized to enable multi-center trials. We thus sought to establish a rational basis for understanding the dose requirements and evaluating how different pulse sequences and 129Xe doses can influence 129Xe ventilation quantification. From that, the minimum required 129Xe dose for an adequate 129Xe ventilation quantification can be derived.
The emergence and development of 129Xe gas transfer MRI has introduced not only the ability to regionally assess gas exchange, but also the interesting problem that it delivers ventilation data from the same breath. However, the gas phase is acquired differently, with low resolution and isotropically. This raises the question as to how to generalize the ventilation quantification approach previously introduced specifically for multi-slice GRE. Therefore, we sought to generalize the linear binning approach for rescaling the intensity histogram, which enables the application of linear binning analysis to any ventilation MRI acquisition. We also investigated whether, and to what extent, 3D-radial acquisition can provide diagnostic information similar to that from a dedicated multi-slice GRE acquisition. Through these efforts, we evaluated the possibility of employing a more efficient scan protocol for future routine clinical application.
During the course of this work, several practical engineering challenges were raised. First, hyperpolarized MRI has so far mostly been demonstrated at 1.5 Tesla (T), while most MRI vendors are transitioning multi-nuclear platforms to 3 T. This transition from 1.5 T to 3 T requires a reconsideration of optimal imaging acquisition and further optimization of the quantification method. Moreover, preparation for multi-center dissemination points to the need for future centralized processing. This leads to the interest in cloud-based processing. However, in order to make this possible, manual segmentation of the thoracic cavity must be replaced by automatic methods. This, in turn, requires the use of novel neural network-based approaches. To this end, we first optimized the sequence on the transition to our new 3 T system. After completing the transition, the linear binning quantification method was further optimized with an enhanced vasculature segmentation and a neural network-based 1H thoracic cavity segmentation. We also exploited the emergence of RBC transfer and implemented a framework to interpret these images by comparing them to more well-established approaches such as Gd-enhanced dynamic contrast-enhanced (DCE) perfusion MRI. To this end, we also developed a quantitative perfusion imaging pipeline that could be used to interpret the causes of RBC defects in our gas exchange imaging.
Taken together, results presented in this dissertation provide the step-by-step development of our rapid clinical exam workflow for hyperpolarized 129Xe MRI. This clinical workflow demonstrates not only a comprehensive image quantification pipeline with applications to 129Xe ventilation images and Gd-enhanced DCE MRI, but also the considerations for the acquisition sequence and delivered 129Xe dose. Overall, the established quantification pipeline offers a robust and sensitive way for disease phenotyping, disease monitoring, and treatment planning. Moreover, this thesis work has hopefully laid the groundwork for standardized quantification that could be deployed for future multi-center clinical trials.
Item Open Access Measuring robustness of brain networks in autism spectrum disorder with Ricci curvature. (Scientific reports, 2020-07-02) Simhal, Anish K; Carpenter, Kimberly LH; Nadeem, Saad; Kurtzberg, Joanne; Song, Allen; Tannenbaum, Allen; Sapiro, Guillermo; Dawson, Geraldine
Ollivier-Ricci curvature is a method for measuring the robustness of connections in a network. In this work, we use curvature to measure changes in robustness of brain networks in children with autism spectrum disorder (ASD). In an open-label clinical trial, participants with ASD were administered a single infusion of autologous umbilical cord blood and, as part of their clinical outcome measures, were imaged with diffusion MRI before and after the infusion. By using Ricci curvature to measure changes in robustness, we quantified both local and global changes in the brain networks and their potential relationship with the infusion. Our results show changes in the curvature of the connections between regions associated with ASD that were not detected via traditional brain network analysis.
Item Open Access Minimax Fairness in Machine Learning (2022) Martinez Gil, Natalia Lucienne
The notion of fairness in machine learning has gained significant popularity in the last decades, in part due to the large number of decision-making models that are being deployed on real-world applications, which have presented unwanted behavior. In this work, we analyze fairness in machine learning from a multi-objective optimization perspective, where the goal is to learn a model that achieves a good performance across different groups or demographics. In particular, we analyze how to achieve models that are efficient in the Pareto sense, providing the best performance for the worst group (i.e., minimax solutions). We study how to achieve minimax Pareto fair solutions when sensitive groups are available at training time, and also when the demographics are completely unknown.
We provide experimental results showing how the discussed techniques to achieve minimax Pareto fair solutions perform on classification tasks, and how they can be adapted to work on other applications such as backward compatibility and federated learning. Finally, we analyze the problem of achieving minimax solutions asymptotically when we optimize models that can perfectly fit their training data, such as deep neural networks trained with stochastic gradient descent.
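The minimax criterion itself (best performance for the worst group) can be made concrete with a toy selection rule over a fixed set of candidate models. This is only an illustration of the criterion, not the optimization procedure from the dissertation; the helper names and all loss values are made up.

```python
import numpy as np

def group_risks(losses, groups):
    """Mean loss per group, returned as a dict mapping group label to risk."""
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    return {g: float(losses[groups == g].mean()) for g in np.unique(groups)}

def minimax_choice(candidate_losses, groups):
    """Pick the candidate whose worst-group risk is smallest."""
    worst = [max(group_risks(l, groups).values()) for l in candidate_losses]
    return int(np.argmin(worst)), worst

# Hypothetical per-sample losses of two candidate models on four samples,
# three from group "a" and one from group "b".
groups = np.array(["a", "a", "a", "b"])
candidate_losses = [
    [0.1, 0.1, 0.1, 0.9],  # lower average loss, but poor on group "b"
    [0.3, 0.3, 0.3, 0.4],  # higher average loss, better worst-group loss
]
best, worst = minimax_choice(candidate_losses, groups)
```

The first candidate wins on average loss (0.3 versus 0.325), yet the minimax rule selects the second one because its worst-group risk is about 0.4 rather than 0.9.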
Item Open Access Robustness and Generalization Under Distribution Shifts (2022) Bertran Lopez, Martin Andres
Machine learning algorithms are applied in a wide variety of fields such as finance, healthcare, and entertainment. The objectives of these machine learning algorithms are varied, with two of the most common use cases being inference of a target variable from observations, and sequential decision-making to maximize a reward in the reinforcement learning setting. Regardless of the objective, it is common for machine learning algorithms to be trained on a finite dataset where each sample is collected independently from some data distribution emulating the real world, or, in the case of reinforcement learning, over a finite set of interactions with an environment simulating real world interactions.
One major concern is how to characterize the generalization of these objectives outside of their training data, measured as the discrepancy between performance on the training dataset or environment, and performance in the real world. This is exacerbated by the fact that many applications suffer from distribution shift, a phenomenon where there is a mismatch between the training distribution and the real world environment. Algorithms that are not robust to distribution shifts are liable to present unintended behaviours during deployment. In this work, we develop tools to minimize the risks posed by distribution shifts in a variety of settings. In the first part of this work, we propose and analyze techniques to deal with distribution shifts in the supervised learning setting, making the model's decision either independent or robust to certain factors in the input distribution, and show the efficacy of these techniques in dealing with distribution shift. We later examine the setting of sequential decision making, where we discuss how to reinterpret the reinforcement learning scenario in a way that allows generalization bounds from standard supervised learning to be applied to reinforcement learning. We then analyze how to learn representations that are invariant to task-irrelevant distribution, and demonstrate how this can improve performance in the presence of distribution shifts.
Item Open Access Stop memorizing: A data-dependent regularization framework for intrinsic pattern learning Zhu, Wei; Qiu, Qiang; Wang, Bao; Lu, Jianfeng; Sapiro, Guillermo; Daubechies, Ingrid
Deep neural networks (DNNs) typically have enough capacity to fit random data by brute force even when conventional data-dependent regularizations focusing on the geometry of the features are imposed. We find that the reason for this is the inconsistency between the enforced geometry and the standard softmax cross entropy loss. To resolve this, we propose a new framework for data-dependent DNN regularization, the Geometrically-Regularized-Self-Validating neural Networks (GRSVNet). During training, the geometry enforced on one batch of features is simultaneously validated on a separate batch using a validation loss consistent with the geometry. We study a particular case of GRSVNet, the Orthogonal-Low-rank Embedding (OLE)-GRSVNet, which is capable of producing highly discriminative features residing in orthogonal low-rank subspaces. Numerical experiments show that OLE-GRSVNet outperforms DNNs with conventional regularization when trained on real data. More importantly, unlike conventional DNNs, OLE-GRSVNet refuses to memorize random data or random labels, suggesting it only learns intrinsic patterns by reducing the memorizing capacity of the baseline DNN.
Item Open Access Uncovering the Connectome(2019) Simhal, Anish Kumar
Over the past two decades, there has been an explosion in the number of tools and technology available to neuroscientists. With the advent of array tomography (AT) in the last decade, our ability to study synapses and their proteometric composition in the mammalian cortex has skyrocketed. However, unlike electron microscopy (EM) data, which is the gold standard for synapse detection, AT data presents a variety of challenges in visualizing and characterizing synapses. 
There are many sources of noise, no singular definition of a synapse, and no standardized approach for data processing. In this work, our goal is to study synapse anatomy by combining array tomography with novel image processing methods. First, we started by creating a probabilistic synapse detector, which detects synapses based on their proteometric subtype with no training data. Then, we created a tool to characterize the efficacy of antibodies for array tomography applications. We end by expanding the probabilistic synapse detection method for tripartite synapses and explore the differences in synapses between wild-type and FMR1 knockout mice. This analysis led to the discovery of several new effects of the FMR1 gene on astrocytic synapse density, including the observation that there is a significant decrease in the density of excitatory glutamatergic synapses and their association with astrocytes, while the changes in inhibitory GABAergic synapses are less pronounced. Our results suggest that in Fragile X Syndrome astrocytes may mediate at least some of the pathological effects on glutamatergic synapses, while GABAergic synapses are likely influenced by a different mechanism.
Item Open Access Use of Machine Learning and Computer Vision Methods for Building Behavioral and Electrophysiological Biomarkers for Brain Disorders(2023) Isaev, Dmitry
Research on biomarkers of brain disorders is an actively developing area. Biomarkers may allow for the early detection of diseases, which is essential for early intervention and improved outcomes. Biomarkers for monitoring the changes in the patient’s state can potentially increase the efficiency of clinical trials. Digital biomarkers, which emerged in recent years, rely on applications of machine learning methods to the data gathered by low-cost sensors, often embedded in consumer devices. Digital biomarkers have the potential to provide low-cost metrics that are more objective, granular, and sensitive to change than traditional clinical ratings used in assessments of neurological and neurodevelopmental disorders. On the other hand, in traditional electrophysiological methods measuring brain activity, such as electroencephalography (EEG), biomarkers historically were based on visual analysis by clinicians, classical signal processing measures, or the event-related potential (ERP) technique. The search for machine learning-based EEG biomarkers is an active area of research. This dissertation aims to build novel digital behavioral and EEG-based biomarkers and outcome measures by applying machine learning to behavioral, EEG, and concurrently recorded behavioral and EEG data. Machine learning models for the detection of gaze, human face and body landmarks, and automatic speech recognition achieve good performance on publicly available datasets. However, applying these models to a new clinical dataset immediately incurs a dataset shift problem, since the conditions under which real clinical video and audio data are recorded differ from those of the training dataset (e.g. different video camera angles, or audio noise). 
Furthermore, clinical datasets are in general much smaller than those used for training such models, and there are not enough human resources in the clinical setting to perform data labeling, making re-training infeasible. Yet the question remains whether the predictions from pre-trained models can provide valuable insight into human behavior and neurophysiology in the clinical setting, and whether they can be a source of clinically relevant findings. In this dissertation, we first explore this question in two use cases: (1) building digital measures of caregiver-child interaction in neurodevelopmental disorders using pre-trained pose detection deep learning models; (2) creating a digital biomarker of ataxic dysarthria using pre-trained automatic speech recognition deep learning models. We show that in the first case, our method makes it possible to distinguish different clusters of caregiver responsiveness which are associated with a child’s caregiver- and clinician-reported socialization, communication, and language abilities, thus demonstrating the feasibility of using digital measures of caregiver-child interaction in clinical trials. In the second case, we demonstrate the convergent validity of our novel biomarker with clinician-reported scores and its greater sensitivity to change than clinician-reported scores on a longitudinal dataset. Second, we propose a novel deep learning model for detecting seizures in neonates from EEG data. We demonstrate the model’s high generalizability by evaluating it on an independent dataset from another hospital and show that the model, by design, can be applied in different facilities with different EEG hardware. This approach has the potential to be clinically validated and will allow studies of neonatal seizures to be scaled up by increasing the sample sizes (including data from multiple clinical centers). 
Finally, we turn to the problem of combining EEG and behavioral biomarkers, which can not only improve biomarker sensitivity but also provide new insights into brain-behavior relationships. In the study of autism, we propose a new metric of attentional preference to social/non-social stimuli and show that it not only distinguishes between autistic and neurotypical children, but is also differently associated with brain activity as measured by EEG. Then we turn to the question of scaling up EEG and behavior studies and provide a tool that allows measuring participants’ attention to the screen during EEG recording. This tool will reduce human effort and make measurements of participants’ visual attention more objective, thus scaling up data preprocessing and allowing for multi-center studies of concurrent EEG and behavior.