Towards Personalized, Privacy-preserving, and Explainable AI for Healthcare

dc.contributor.advisor

Chakrabarty, Krishnendu

dc.contributor.advisor

Zhang, Anru

dc.contributor.author

Jiang, Shiyi

dc.date.accessioned

2025-07-02T19:02:58Z

dc.date.available

2025-07-02T19:02:58Z

dc.date.issued

2024

dc.department

Electrical and Computer Engineering

dc.description.abstract

In recent decades, the growth of electronic healthcare (eHealth) has been driven by advances in Artificial Intelligence (AI), wearable devices, and the Internet of Things (IoT). Remote health monitoring, one of the pivotal eHealth services, involves data sensing, collection, preprocessing, analysis, diagnosis, and feedback/recommendations, and plays a crucial role in early disease identification and prevention. It reduces the cost of traveling to hospitals and enables faster diagnoses than traditional physician-centered healthcare.

In most prior work, data is transmitted to remote cloud servers for comprehensive analysis, leveraging their large storage capacity and on-demand computational resources. However, transmitting all data to the cloud incurs significant time and energy costs. Edge devices facilitate real-time, context-aware analysis with reduced transmission overhead, but their computational, storage, and power resources are limited. It is thus crucial to investigate the interplay between the edge and the cloud to balance model performance and costs effectively. In Chapter 2, we present an edge-cloud hierarchical model in an IoT setting that optimizes the balance between classification performance and overhead, and we demonstrate it using stress monitoring as a case study.

Nonetheless, a universal model may not provide precise outcomes for an individual client because data distributions differ among clients, a problem known as data heterogeneity. Creating personalized models is therefore a crucial area of research. Furthermore, sending data directly from local edge devices to the cloud can leak significant private information. To address such privacy concerns, we propose a decentralized approach that avoids transmitting data from the local edge. In Chapter 3, we introduce a personalized stress monitoring model in the federated learning (FL) setting that uses latent space clustering. In Chapter 4, we tackle data heterogeneity within the client by leveraging multi-domain learning and graph convolutional networks. In Chapter 5, we propose a robust defense against gradient inversion attacks on medical image data, based on latent space data perturbation and minimax optimization.

The chapters above address the significance of, and unresolved challenges in, remote health monitoring. However, remote health monitoring is not a substitute for the traditional physician-centered healthcare system. Following diagnosis, patients still need to go to hospitals for necessary medical procedures, including additional lab tests, surgeries, and subsequent medication. This is the stage where Electronic Health Records (EHR) analysis using explainable AI becomes pivotal.

EHR data includes digitized copies of patient information, hospital/intensive care unit (ICU) admissions, lab measurements, and clinical notes. However, because patients visit hospitals at unpredictable times, EHR data may not reveal the actual disease progression timeline: many data points on the timeline are unobservable, and records are misaligned between patients, which introduces data bias. In Chapter 6, we devise an EHR registration approach to synchronize patient records with a population-wide disease progression template. However, establishing a universal disease progression model for every patient is clinically impractical. Consequently, we explore customized modeling with various sub-types of progression curves. This leads to the final study, described in Chapter 7, which focuses on disease sub-phenotyping through semi-supervised soft clustering of sepsis patients.

This dissertation presents methods for developing personalized and privacy-preserving health monitoring solutions with low overhead. It also highlights emerging open challenges in clinical research using EHR data and potential approaches to mitigating them. The proposed methods provide insights for building advanced solutions for precision medicine.

dc.identifier.uri

https://hdl.handle.net/10161/32631

dc.rights.uri

https://creativecommons.org/licenses/by-nc-nd/4.0/

dc.subject

Computer engineering

dc.title

Towards Personalized, Privacy-preserving, and Explainable AI for Healthcare

dc.type

Dissertation

duke.embargo.months

19

duke.embargo.release

2027-01-13
