Overcoming Data Constraints in Applying Deep Learning to Healthcare
Date
2024
Abstract
Applying deep learning (DL) models in healthcare can significantly improve diagnostic speed, decision-making, early detection of chronic diseases, and the customization of treatments for individual patients. However, a major challenge in integrating DL into healthcare is the persistent shortage of labeled data. This shortage stems from the limited number of available patients, the labor-intensive and costly process of collecting annotations, and data privacy laws that restrict the sharing of labeled data between institutions.
My doctoral research focuses on applying DL to real-world healthcare scenarios and addressing the challenges associated with label insufficiency and data sharing restrictions. This dissertation details my efforts in developing a DL framework tailored for healthcare applications, introducing new algorithms to make label collection more efficient, and proposing solutions to maximize dataset utility despite constraints on data sharing.
My initial project created a DL framework for detecting skin lesion locations and predicting malignancy from dermatological images captured via smartphones or dermoscopy. We customized a two-stage framework that adapted traditional DL methods for object detection and image classification to suit our medical dataset, which is characterized by images with noisy and complex backgrounds. My second project focuses on active learning (AL) to improve the efficiency of DL models in settings such as healthcare, where annotations are time-consuming and costly. We introduced a robust AL algorithm that incorporates a novel AL objective using influence functions. Experimental results show that our approach consistently outperforms both random selection and existing AL methods across various practical settings. My latest research tackles healthcare scenarios where data access differs between the training and inference stages due to strict privacy regulations. We use contrastive learning to create pretrained encoders for data from various sources. These encoders produce embeddings that enhance information sharing across datasets, thereby improving the performance of downstream tasks under restricted data access conditions.
Overall, my doctoral research has advanced the application of DL in healthcare, enhancing model adaptability and performance through customized and innovative frameworks.
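As an illustration of the contrastive pretraining idea mentioned in the abstract, the following is a minimal sketch of a symmetric InfoNCE-style loss over paired embeddings. The abstract does not state which contrastive objective the dissertation uses, so the choice of InfoNCE, the function name `info_nce_loss`, and the `temperature` value are all assumptions made for illustration only.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE loss for paired embeddings of shape (n, d).

    Row i of z_a and row i of z_b are treated as a positive pair;
    every other row in the batch serves as a negative.
    This is an illustrative objective, not the dissertation's method.
    """
    # L2-normalize so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by the (assumed) temperature.
    logits = z_a @ z_b.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        # The correct "class" for row i is column i.
        return -np.mean(np.diag(log_probs))

    # Average the a-to-b and b-to-a directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))
```

In this sketch, `z_a` and `z_b` would be embeddings of the same underlying records produced by two source-specific encoders; minimizing the loss pulls matched pairs together and pushes unmatched pairs apart, which is one common way to obtain embeddings that can be compared across data sources.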
Citation
Xia, Meng (2024). Overcoming Data Constraints in Applying Deep Learning to Healthcare. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32573.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.