Exploration and Application of Dimensionality Reduction and Clustering Techniques to Diabetes Patient Health Records
Abstract
This research examines several dimensionality reduction techniques and clustering
methods. The goal was to apply them to a test dataset and a healthcare dataset
to see how they work in practice and what conclusions could be drawn from their application.
Specifically, we hoped to identify clusters of similar diabetes patients and develop
hypotheses about the risk of adverse events, to guide further research into sub-populations of
diabetes patients. In applying these methods, it became apparent that
dimensionality reduction and clustering are sensitive to parameter
settings and must be tuned carefully to be successful. Additionally, we saw several
statistically significant differences in outcomes among the clusters identified in
these data. We focused on coronary artery disease and kidney disease. In
these clusters, we found a high proportion of patients taking medications for heart
or kidney conditions. Based on these findings, we identified future directions
that build upon this research and could lead to more actionable conclusions.
Type
Honors thesis
Department
Computer Science
Permalink
https://hdl.handle.net/10161/14589
Citation
Gopinath, Sidharth (2017). Exploration and Application of Dimensionality Reduction and Clustering Techniques to Diabetes Patient Health Records. Honors thesis, Duke University. Retrieved from https://hdl.handle.net/10161/14589.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Undergraduate Honors Theses and Student papers
Works are deposited here by their authors, and represent their research and opinions, not those of Duke University. Some materials and descriptions may include offensive content.