Estimating the Intrinsic Dimension of High-Dimensional Data Sets: A Multiscale, Geometric Approach

Date

2011

Abstract

This work addresses the problem of estimating the intrinsic dimension of noisy, high-dimensional point clouds. It considers a general class of sets that are locally well approximated by k-dimensional planes but are embedded in a D-dimensional Euclidean space with D >> k. Given samples from such a set, possibly corrupted by high-dimensional noise, the dimension can be recovered using PCA if the data is linear. When the data is non-linear, however, PCA fails and overestimates the intrinsic dimension. A multiscale version of PCA is therefore introduced that is robust to small sample size, noise, and non-linearities in the data.
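
The contrast between global and local PCA can be illustrated with a small sketch. This is not the algorithm from the dissertation: the test set (a noisy unit circle, so k = 1, embedded in D = 20 dimensions), the relative singular-value threshold of 0.2, the three radii, and the helper names `pca_dimension` and `local_pca_dimension` are all assumptions chosen for illustration. The sketch counts the significant singular values of the data inside balls of different radii around one sample point.

```python
import numpy as np

def pca_dimension(points, threshold=0.2):
    """Count singular values above `threshold` times the largest one.

    The relative threshold is an illustrative assumption, not a rule
    taken from the dissertation.
    """
    centered = points - points.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(s > threshold * s[0]))

def local_pca_dimension(points, center, radius, threshold=0.2):
    """Apply the same count to the points inside a ball around `center`."""
    inside = np.linalg.norm(points - center, axis=1) <= radius
    return pca_dimension(points[inside], threshold)

# Hypothetical test set: a unit circle (k = 1) in D = 20 dimensions,
# corrupted by isotropic Gaussian noise of standard deviation sigma.
rng = np.random.default_rng(0)
n, D, sigma = 2000, 20, 0.01
theta = rng.uniform(0.0, 2.0 * np.pi, n)
X = np.zeros((n, D))
X[:, 0] = np.cos(theta)
X[:, 1] = np.sin(theta)
X += sigma * rng.standard_normal((n, D))

# Global PCA sees the plane containing the circle, not the curve itself.
global_dim = pca_dimension(X)

# The same estimate at three scales around one sample point.
local_dims = {r: local_pca_dimension(X, X[0], r) for r in (0.08, 0.3, 2.5)}

print("global PCA estimate:", global_dim)
for r, d in local_dims.items():
    print(f"radius {r}: local PCA estimate {d}")
```

In this sketch the smallest radius is comparable to the noise level, so noise directions inflate the count; the largest radius covers the whole circle and reproduces the global overestimate; only the intermediate radius isolates the single tangent direction. Examining the estimate across scales, rather than at one fixed scale, is the intuition behind a multiscale approach.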

Citation

Little, Anna Victoria (2011). Estimating the Intrinsic Dimension of High-Dimensional Data Sets: A Multiscale, Geometric Approach. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/3863.

Duke student scholarship is made available to the public under a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.