Towards Personalized, Privacy-preserving, and Explainable AI for Healthcare
Abstract
In recent decades, advances in Artificial Intelligence (AI), wearable devices, and the Internet of Things (IoT) have greatly accelerated the adoption of electronic healthcare (eHealth). As one of the pivotal eHealth services, remote health monitoring, which involves data sensing, collection, preprocessing, analysis, diagnosis, and feedback/recommendations, plays a crucial role in early disease identification and prevention. It reduces the expense of traveling to hospitals and enables faster diagnoses than traditional physician-centered healthcare.
In most prior work, data is transmitted to remote cloud servers for comprehensive analysis, leveraging their large storage capacity and on-demand computational resources. However, transmitting all data to the cloud incurs significant time and energy costs. Edge devices facilitate real-time, context-aware analysis with reduced transmission overhead, but their computational, storage, and power resources are limited. It is thus crucial to investigate the interplay between the edge and the cloud to balance model performance and cost effectively. In Chapter 2, we present an edge-cloud hierarchical model in the IoT context that balances classification performance against overhead, using stress monitoring as a case study.
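One common way to realize an edge-cloud trade-off of this kind is confidence-based offloading: a small edge model answers when it is confident, and the sample is sent to a larger cloud model otherwise. The sketch below illustrates that idea only; the abstract does not state how the hierarchical model in Chapter 2 decides where to run inference, so the function name `hierarchical_predict`, the `threshold` parameter, and the toy models are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

Probs = Sequence[float]
Model = Callable[[Sequence[float]], Probs]


def hierarchical_predict(
    sample: Sequence[float],
    edge_model: Model,
    cloud_model: Model,
    threshold: float = 0.8,  # assumed confidence cut-off, not from the source
) -> Tuple[int, str]:
    """Return (predicted class index, location where the decision was made)."""
    edge_probs = list(edge_model(sample))
    edge_conf = max(edge_probs)
    if edge_conf >= threshold:
        # Confident enough: keep the decision on the edge, no transmission.
        return edge_probs.index(edge_conf), "edge"
    # Otherwise pay the transmission cost and defer to the cloud model.
    cloud_probs = list(cloud_model(sample))
    return cloud_probs.index(max(cloud_probs)), "cloud"


# Toy stand-ins for real classifiers (hypothetical, for illustration only).
def toy_edge_model(sample: Sequence[float]) -> Probs:
    return [0.9, 0.1] if sample[0] > 1.0 else [0.55, 0.45]


def toy_cloud_model(sample: Sequence[float]) -> Probs:
    return [0.2, 0.8]


easy = hierarchical_predict([2.0], toy_edge_model, toy_cloud_model)
hard = hierarchical_predict([0.5], toy_edge_model, toy_cloud_model)
```

Raising `threshold` sends more samples to the cloud, which would typically improve accuracy at the price of more transmission; lowering it does the opposite.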
Nonetheless, a universal model may not provide precise outcomes for an individual client because data distributions differ among clients, a property known as data heterogeneity. Creating personalized models is therefore a crucial area of research. Furthermore, sending data directly from local edge devices to the cloud can leak significant private information. To address such privacy concerns, we propose a decentralized approach that avoids transmitting data from the local edge. In Chapter 3, we introduce a personalized stress monitoring model in the federated learning (FL) setting that uses latent space clustering. In Chapter 4, we tackle data heterogeneity within the client by leveraging multi-domain learning and graph convolutional networks. In Chapter 5, we propose a robust defense against gradient inversion attacks on medical image data via latent space data perturbation and minimax optimization.
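To make the combination of federated learning and clustering concrete, the sketch below groups clients by a latent-space summary and runs a sample-weighted average of model weights within each group, so that each cluster gets its own model while raw data stays on the client. This is a generic illustration, not the method of Chapter 3: the abstract does not describe how the latent clustering is performed, and the names `nearest_centroid` and `clustered_fedavg`, the fixed centroids, and the toy numbers are assumptions.

```python
from typing import Dict, List, Sequence


def nearest_centroid(embedding: Sequence[float],
                     centroids: Sequence[Sequence[float]]) -> int:
    """Index of the centroid closest to the embedding (squared Euclidean)."""
    dists = [sum((e - c) ** 2 for e, c in zip(embedding, centroid))
             for centroid in centroids]
    return dists.index(min(dists))


def clustered_fedavg(client_weights: Sequence[Sequence[float]],
                     client_sizes: Sequence[int],
                     client_embeddings: Sequence[Sequence[float]],
                     centroids: Sequence[Sequence[float]]) -> Dict[int, List[float]]:
    """Average client weights per cluster, weighted by local sample count."""
    groups: Dict[int, List[int]] = {}
    for i, emb in enumerate(client_embeddings):
        groups.setdefault(nearest_centroid(emb, centroids), []).append(i)

    models: Dict[int, List[float]] = {}
    for cluster_id, members in groups.items():
        total = sum(client_sizes[i] for i in members)
        dim = len(client_weights[members[0]])
        models[cluster_id] = [
            sum(client_weights[i][d] * client_sizes[i] for i in members) / total
            for d in range(dim)
        ]
    return models


# Toy example: three clients, two latent clusters (all values hypothetical).
models = clustered_fedavg(
    client_weights=[[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]],
    client_sizes=[1, 3, 2],
    client_embeddings=[[0.0], [0.1], [5.0]],
    centroids=[[0.0], [5.0]],
)
```

Only model weights and a low-dimensional embedding leave each client in this sketch; whether even those are safe to share is exactly the kind of question that gradient inversion attacks raise.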
We have summarized the significance and unresolved challenges in remote health monitoring. However, it’s important to note that this system is not a substitute for the traditional physician-centered healthcare system. Following diagnosis, patients are required to go to hospitals for necessary medical procedures, including additional lab tests, surgeries, and subsequent medication. This is the stage where Electronic Health Records (EHR) analysis using explainable AI becomes pivotal.
EHR data includes digitized copies of patient information, hospital/intensive care unit (ICU) admissions, lab measurements, and clinical notes. However, because patients visit hospitals at unpredictable times, EHR data may not reveal the actual disease progression timeline: many points on that timeline are unobserved, and records are misaligned across patients, which introduces data bias. In Chapter 6, we devise an EHR registration approach to align patient records with a population-wide disease progression template. However, establishing a universal disease progression model for every patient is clinically impractical. Consequently, we explore customized modeling with various sub-types of progression curves. This leads to our final study, described in Chapter 7, which focuses on disease sub-phenotyping through semi-supervised soft clustering of sepsis patients.
This dissertation presents methods for developing personalized and privacy-preserving health monitoring solutions with low overhead. Also, it highlights the emerging open challenges and potential approaches to mitigate the problems in clinical research using EHR data. The proposed methods provide insights for building advanced solutions for precision medicine.
Citation
Jiang, Shiyi (2024). Towards Personalized, Privacy-preserving, and Explainable AI for Healthcare. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32631.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.