Understanding Dimension Reduction for Data Visualization
| dc.contributor.advisor | Rudin, Cynthia | |
| dc.contributor.author | Wang, Yingfan | |
| dc.date.accessioned | 2025-07-02T19:02:41Z | |
| dc.date.available | 2025-07-02T19:02:41Z | |
| dc.date.issued | 2024 | |
| dc.department | Computer Science | |
| dc.description.abstract | Dimension Reduction (DR) algorithms have emerged as critical tools that allow scientists to gain insight into high-dimensional data. DR algorithms map high-dimensional data to a low-dimensional embedding, enabling data visualization. A high-quality visualization can help the user gain insight into the cluster structure and distributional characteristics of the data. On the other hand, a low-quality DR visualization can create the appearance of structure in the data that does not actually exist. Without an understanding of the algorithms’ loss functions and which aspects of them affect the embedding, it is difficult to substantially improve upon them. In addition, given the importance of gaining insights from DR, DR methods should be evaluated carefully before their results are trusted. My research presents frameworks to (1) gain insight into how dimension reduction tools work, including how the choice of loss function and of which graph components to include affects the final embedding (Chapter 2); (2) systematically evaluate popular DR methods, including t-SNE, art-SNE, UMAP, PaCMAP, TriMap and ForceAtlas2, which can help users choose DR tools that align with their scientific goals (Chapter 3); and (3) develop three variants of PaCMAP that address different aspects of dimension reduction: BridgeMAP, LocalMAP and ParamRepulsor (Chapters 4, 5 and 6). | |
| dc.identifier.uri | ||
| dc.rights.uri | ||
| dc.subject | Computer science | |
| dc.title | Understanding Dimension Reduction for Data Visualization | |
| dc.type | Dissertation | |
| duke.embargo.months | .25 | |
| duke.embargo.release | 2025-07-13 |