Understanding Dimension Reduction for Data Visualization

Date

2024

Abstract

Dimension Reduction (DR) algorithms have emerged as critical tools that allow scientists to gain insight into high-dimensional data. DR algorithms map high-dimensional data to a low-dimensional embedding, enabling data visualization. A high-quality visualization can help the user gain insight into the cluster structure and distributional characteristics of the data. A low-quality DR visualization, on the other hand, can create the appearance of structure that does not actually exist in the data. Without an understanding of the algorithms’ loss functions and of which aspects of those functions affect the embedding, it is difficult to substantially improve upon them. In addition, because scientific insights are drawn from DR, DR methods should be evaluated carefully before their results are trusted.

My research presents (1) a framework for understanding how dimension reduction tools work, including how the choice of loss function and the choice of graph components affect the final embedding (Chapter 2); (2) a framework for systematically evaluating popular DR methods, including t-SNE, art-SNE, UMAP, PaCMAP, TriMap, and ForceAtlas2, which can help users choose DR tools that align with their scientific goals (Chapter 3); and (3) three variants of PaCMAP, namely BridgeMAP, LocalMAP, and ParamRepulsor, each addressing a different aspect of dimension reduction (Chapters 4, 5, and 6).

Subjects

Computer science

Citation

Wang, Yingfan (2024). Understanding Dimension Reduction for Data Visualization. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32584.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.