Browsing by Author "Iliopoulos, Alexandros Stavros"
Results Per Page
Sort Options
Item Open Access All-Near-Neighbor Search Among High-Dimensional Data via Hierarchical Sparse Graph Code Filtering(2020) Iliopoulos, Alexandros StavrosThis thesis addresses the problem of all-to-all near-neighbor (all-NN) search among a large dataset of discrete points in a high-dimensional feature space endowed with a distance metric. Near-neighbor search is fundamental to numerous computational tasks arising in modern data analysis, modeling, and knowledge discovery. The accuracy and efficiency of near-neighbor search depends on the underlying structure of the dataset. In emerging data-driven analysis applications, the dataset is not necessarily stationary, the structure is not necessarily built once for all queries, and search is not necessarily limited to a few query points. To facilitate accurate and efficient near-neighbor search in a stationary or dynamic dataset, we make a systematic investigation of all-NN search at multiple resolution levels, with attention to several standing or previously overlooked issues associated with high-dimensional feature spaces.
The key contributions are outlined as follows. (i) We proclaim the inherent sparse distribution of a finite discrete dataset in a high-dimensional space. We demonstrate that exploiting this sparsity is pivotal to dispelling the so-called curse of dimensionality in theoretical analysis and practical algorithm design. (ii) We introduce the adjacency graph hierarchy (AGH) as an analysis apparatus as well as base infrastructure for efficient all-NN search algorithm design. (iii) We present an efficient AGH construction algorithm by directly and deterministically pinpointing adjacent nodes at each resolution level via sparse adjacency code filtering, in lieu of exhaustive search within local node neighborhoods. We provide at the same time a statistical estimate of the significant gap in algorithm complexity between direct location of adjacent nodes and local search. (iv) We introduce an adjacency-sensitive, graph-based but decentralized approach for hierarchical partition of a discrete dataset, with potential advantage for accurate and efficient all-NN search in comparison to existing non-interacting partition structures.