Estimating the Intrinsic Dimension of High-Dimensional Data Sets: A Multiscale, Geometric Approach

Date

2011

Abstract

This work addresses the problem of estimating the intrinsic dimension of noisy, high-dimensional point clouds. It considers a general class of sets that are locally well approximated by k-dimensional planes but are embedded in a D-dimensional Euclidean space with D >> k. Given samples from such a set, possibly corrupted by high-dimensional noise, the dimension can be recovered using PCA if the data is linear. When the data is non-linear, however, PCA fails and overestimates the intrinsic dimension. A multiscale version of PCA is therefore introduced that is robust to small sample size, noise, and non-linearities in the data.
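
As a rough illustration of why global PCA overestimates the dimension of non-linear data, the following toy sketch compares PCA on a whole noisy circle (k = 1) embedded in D = 10 dimensions with PCA restricted to a small ball around one sample. It is not the dissertation's algorithm: the 95% variance threshold, the radii, the noise level, and the helper names `pca_dimension` and `local_pca_dimension` are all assumptions chosen for this example.

```python
import numpy as np


def pca_dimension(points, energy=0.95):
    """Number of principal components needed to explain `energy` of the variance.

    The 95% default is an arbitrary cutoff assumed for this sketch.
    """
    centered = points - points.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    cumulative = np.cumsum(variances) / variances.sum()
    # First index whose cumulative share reaches `energy`, converted to a count.
    return int(np.searchsorted(cumulative, energy) + 1)


def local_pca_dimension(points, center, radius, energy=0.95):
    """Apply the same estimate to the samples within `radius` of `center`."""
    distances = np.linalg.norm(points - center, axis=1)
    return pca_dimension(points[distances <= radius], energy)


# Toy data (hypothetical): a unit circle in the first two of D coordinates,
# perturbed by small Gaussian noise in all D coordinates.
rng = np.random.default_rng(0)
n_samples, D = 2000, 10
theta = rng.uniform(0.0, 2.0 * np.pi, n_samples)
data = np.zeros((n_samples, D))
data[:, 0] = np.cos(theta)
data[:, 1] = np.sin(theta)
data += 0.005 * rng.standard_normal((n_samples, D))

# Global PCA sees the curvature of the whole circle as a second dimension.
global_dim = pca_dimension(data)

# Local PCA at several scales around one sample; the largest radius
# contains the entire circle, so it coincides with the global estimate.
local_dims = {r: local_pca_dimension(data, data[0], r) for r in (0.3, 0.6, 2.5)}

print("global estimate:", global_dim)
print("local estimates by radius:", local_dims)
```

On this toy data the global estimate should come out larger than the estimate at the smallest radius, because at small scales the curved set looks like its tangent line. Inspecting how the estimate changes with the radius is the loose sense in which this sketch is "multiscale"; the method in the dissertation may differ substantially in how scales are chosen and how noise is handled.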

Citation

Little, Anna Victoria (2011). Estimating the Intrinsic Dimension of High-Dimensional Data Sets: A Multiscale, Geometric Approach. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/3863.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.