Use of Machine Learning and Computer Vision Methods for Building Behavioral and Electrophysiological Biomarkers for Brain Disorders


Research on biomarkers of brain disorders is an actively developing area. Biomarkers may allow for the early detection of diseases, which is essential for early intervention and improved outcomes. Biomarkers for monitoring changes in a patient’s state can potentially increase the efficiency of clinical trials. Digital biomarkers, which have emerged in recent years, rely on applying machine learning methods to data gathered by low-cost sensors, often embedded in consumer devices. Digital biomarkers have the potential to provide metrics that are low-cost and more objective, granular, and sensitive to change than the traditional clinical ratings used in assessments of neurological and neurodevelopmental disorders. On the other hand, in traditional electrophysiological methods for measuring brain activity, such as electroencephalography (EEG), biomarkers have historically been based on visual analysis by clinicians, classical signal processing measures, or the event-related potential (ERP) technique. The search for machine learning-based EEG biomarkers is an active area of research. This dissertation aims to build novel digital behavioral and EEG-based biomarkers and outcome measures by applying machine learning to behavioral, EEG, and concurrently recorded behavioral and EEG data. Machine learning models for the detection of gaze and of human face and body landmarks, and for automatic speech recognition, achieve good performance on publicly available datasets. However, applying these models to a new clinical dataset immediately incurs a dataset shift problem, since the conditions under which real clinical video and audio data are recorded differ from those of the training dataset (e.g., different video camera angles or audio noise). Furthermore, clinical datasets are in general much smaller than those used for training such models, and there are not enough human resources in the clinical setting to perform data labeling, making re-training infeasible.
Yet the question remains whether the predictions from pre-trained models can provide valuable insight into human behavior and neurophysiology in the clinical setting, and whether they can be a source of clinically relevant findings. In this dissertation, we first explore this question in two use cases: (1) building digital measures of caregiver-child interaction in neurodevelopmental disorders using pre-trained pose detection deep learning models; (2) creating a digital biomarker of ataxic dysarthria using pre-trained automatic speech recognition deep learning models. We show that in the first case, our method makes it possible to distinguish different clusters of caregiver responsiveness, which are associated with a child’s caregiver- and clinician-reported socialization, communication, and language abilities, thus demonstrating the feasibility of using digital measures of caregiver-child interaction in clinical trials. In the second case, we demonstrate, on a longitudinal dataset, the convergent validity of our novel biomarker with clinician-reported scores and its greater sensitivity to change compared with those scores. Second, we propose a novel deep learning model for detecting seizures in neonates from EEG data. We demonstrate the model’s high generalizability by evaluating it on an independent dataset from another hospital and show that the model, by design, can be applied in different facilities with different EEG hardware. This approach has the potential to be clinically validated and will allow studies of neonatal seizures to be scaled up by increasing sample sizes (including data from multiple clinical centers). Finally, we turn to the problem of combining EEG and behavioral biomarkers, which can not only improve biomarker sensitivity but also provide new insights into brain-behavior relationships.
In the study of autism, we propose a new metric of attentional preference to social/non-social stimuli and show that it not only distinguishes between autistic and neurotypical children but is also differently associated with brain activity as measured by EEG. Then we turn to the question of scaling up EEG and behavior studies and provide a tool for measuring participants’ attention to the screen during EEG recording. This tool will reduce human effort and make measurements of participants’ visual attention more objective, thus scaling up data preprocessing and allowing for multi-center studies of concurrent EEG and behavior.





Isaev, Dmitry (2023). Use of Machine Learning and Computer Vision Methods for Building Behavioral and Electrophysiological Biomarkers for Brain Disorders. Dissertation, Duke University. Retrieved from


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.