Browsing by Subject "Topological data analysis"
Item Open Access Algebraic Data Structure for Decomposing Multipersistence Modules (2020-11-12) Li, Joey
Single-parameter persistent homology techniques in topological data analysis have seen increasing usage in recent years. These techniques have found particular success because of the existence of a complete, discrete, efficiently computable invariant to describe persistence modules in the single-parameter case: the barcode. Attempts to develop an equally robust theory of multiparameter persistent homology, however, have been slow to progress because there is no natural multiparameter analogue to the barcode. Relatively little is known about the structure of decompositions of multiparameter persistence (multipersistence) modules or how to classify their indecomposables. In fact, even for the problem of computing decompositions, there currently is no generalization to multiple parameters of the decomposition algorithm from single-parameter persistent homology. In this paper, we define a new algebraic data structure, the QR code, which was first proposed in https://arxiv.org/abs/1709.08155 but was formulated somewhat erroneously. Additionally, we prove a theorem stating that the QR code recovers all the information of the module it encodes. We suggest that this new data structure, which seeks to encode a module using births and deaths rather than births and relations, may be the correct language in which to solve the problem of decomposing arbitrary finitely generated multipersistence modules.
Item Open Access Applications of Topological Data Analysis and Sliding Window Embeddings for Learning on Novel Features of Time-Varying Dynamical Systems (2017) Ghadyali, Hamza Mustafa
This work introduces geometric and topological data analysis (TDA) tools that can be used in conjunction with sliding window transformations, also known as delay-embeddings, for discovering structure in time series and dynamical systems in an unsupervised or supervised learning framework.
For signals of unknown period, we introduce an intuitive topological method to discover the period, and we demonstrate its use in synthetic examples and real temperature data. Alternatively, for almost-periodic signals of known period, we introduce a metric called Geometric Complexity of an Almost Periodic signal (GCAP), based on a topological construction, which allows us to continuously measure the evolving variation of its periods. We apply this method to temperature data collected from over 200 weather stations in the United States and describe the novel patterns that we observe. Next, we show how geometric and TDA tools can be used in a supervised learning framework. Seizure detection using electroencephalogram (EEG) data is formulated as a binary classification problem. We define new collections of geometric and topological features of multi-channel data, which utilize the temporal and spatial context of EEG, and show that they yield better overall seizure-detection performance than the usual time-domain and frequency-domain features. Finally, we introduce a novel method to sonify persistence diagrams, and more generally any planar point cloud, using a modified version of the harmonic table. This auditory display can be useful for finding patterns that visual analysis alone may miss.
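As a rough illustration of the sliding window (delay) embedding that this abstract builds on, the following sketch turns a 1D signal into a point cloud. It is not taken from the thesis: the function name `sliding_window` and the parameters `dim` (window dimension) and `tau` (lag in samples) are hypothetical names chosen for the example, and the synthetic cosine stands in for real data.

```python
import numpy as np

def sliding_window(x, dim, tau=1):
    """Return the delay embedding of a 1D signal x.

    Row i is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    The names dim and tau are illustrative, not from the source.
    """
    x = np.asarray(x, dtype=float)
    n_points = len(x) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("signal too short for this dim and tau")
    # Index matrix: row i, column k holds the sample index i + k * tau.
    idx = np.arange(n_points)[:, None] + tau * np.arange(dim)[None, :]
    return x[idx]

# Synthetic periodic signal: 200 samples of a cosine with a 25-sample period.
t = np.arange(200)
signal = np.cos(2 * np.pi * t / 25)
cloud = sliding_window(signal, dim=10, tau=2)
print(cloud.shape)  # (182, 10)
```

Each row of `cloud` is one point in 10-dimensional space; topological tools such as persistent homology would then be applied to this point cloud rather than to the raw signal.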
Item Open Access Geometric Multimedia Time Series (2017) Tralie, Christopher John
This thesis provides a new take on problems in multimedia time series analysis by using a shape-based perspective to quantify patterns in time, which is complementary to more traditional analysis-based time series techniques. Inspired by the dynamical systems community, we turn time series into shapes via sliding window embeddings, which we refer to as "time-ordered point clouds" (TOPCs). This framework has traditionally been used on a single 1D observation function for deterministic systems, but we generalize the sliding window technique so that it applies not only to multivariate data (e.g. videos) but also to data which is not stationary (e.g. music).
The geometry of our time-ordered point clouds can be quite informative. For periodic signals, the point clouds fill out topological loops, which, depending on harmonic content, reside on various high dimensional tori. For quasiperiodic signals, the point clouds are dense on a torus. We use modern tools from topological data analysis (TDA) to quantify degrees of periodicity and quasiperiodicity by looking at these shapes, and we show that this can be used to detect anomalies in videos of vibrating vocal folds. In the case of videos, this has the advantage of substantially reducing the amount of preprocessing, as no motion tracking is needed, and the technique operates on raw pixels. This is also one of the first known uses of persistent H2 in a high dimensional setting.
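The claim that periodic signals trace out loops can be checked numerically in the simplest case. The sketch below is an illustration under stated assumptions, not the thesis's method: it uses a pure cosine with an integer period of 25 samples (an assumed value) and NumPy's `sliding_window_view` with a lag of one sample. For such a single-harmonic signal, every window is a combination of one cosine vector and one sine vector, so the cloud lies in a 2D plane; signals with more harmonics would generally occupy more dimensions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

period = 25                      # samples per cycle (assumed for the example)
t = np.arange(8 * period)
signal = np.cos(2 * np.pi * t / period)

# Each row is a window of 12 consecutive samples (lag of one sample).
cloud = sliding_window_view(signal, window_shape=12)

# Singular values measure how many directions the cloud actually occupies.
singular_values = np.linalg.svd(cloud, compute_uv=False)
print(singular_values[:4])

# Windows one full period apart coincide, so the curve closes up into a loop.
print(np.allclose(cloud[0], cloud[period]))
```

Only the first two singular values are significantly different from zero, and the window at index 0 matches the window one period later, which is consistent with a closed planar loop.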
Periodic processes represent only a sliver of possible dynamics, and we also show that sequences of arbitrary normalized sliding window point clouds are approximately isometric between "cover songs," or different versions of the same song, possibly with radically different spectral content. Surprisingly, in this application, an incredibly simple geometric descriptor based on self-similarity matrices performs the best, and it also enables us to use MFCC features for this task, which was previously thought not to be possible due to significant timbral differences that can exist between versions. When combined with traditional pitch-based features using similarity metric fusion, we obtain state-of-the-art results on automatic cover song identification.
In addition to being used as a geometric descriptor, self-similarity matrices provide a unifying description of phenomena in time-ordered point clouds throughout our work, and we use them to illustrate properties such as recurrence, mirror symmetry in time, and harmonics in periodic processes. They also provide the base representation for designing isometry-blind time warping algorithms, which we use to synchronize time-ordered point clouds that are shifted versions of each other in space without ever having to do a spatial alignment. In particular, we devise an algorithm that lower bounds the 1-stress between two time-ordered point clouds, which is related to the Gromov-Hausdorff distance.
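A minimal sketch of why self-similarity matrices suit isometry-blind comparison: assuming the matrix is taken to be the table of pairwise Euclidean distances between points of the cloud (a common convention, assumed here rather than quoted from the thesis), it is unchanged when the cloud is rotated, reflected, or translated. The function name `self_similarity_matrix` and the random test data are hypothetical.

```python
import numpy as np

def self_similarity_matrix(cloud):
    """Pairwise Euclidean distances between the rows of a point cloud."""
    diffs = cloud[:, None, :] - cloud[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))

rng = np.random.default_rng(0)
cloud = rng.normal(size=(50, 3))  # 50 time-ordered points in 3D (synthetic)

# Apply a random isometry: an orthogonal matrix from QR plus a translation.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = cloud @ q.T + np.array([5.0, -2.0, 1.0])

ssm_a = self_similarity_matrix(cloud)
ssm_b = self_similarity_matrix(moved)
print(np.allclose(ssm_a, ssm_b))  # True
```

Because the two matrices agree, any comparison that works only with self-similarity matrices never needs to align the clouds in space.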
Overall, we show a proof-of-concept and promise of the nascent field of geometric signal processing, which is worthy of further study in applications of music structure, multimodal data analysis, and video analysis.
Item Open Access Invariants and Metrics for Multiparameter Persistent Homology (2019) Thomas, Ashleigh
This dissertation is about building fundamental techniques for comparing data via a geometric and topological data analysis method called multiparameter persistent homology. The techniques used are largely algebraic. A new summary statistic, called the multirank function, is introduced as a measure of persistence output that detects relationships between important features of the data being analyzed. Also introduced is a technique for modifying existing metrics on the space of persistence outputs. Existing metrics can return infinite distances, which do not give as much information as a finite distance; the proposed modification gives fewer such situations. The final chapter of this dissertation details work in a long-term biology research project. Persistence is used to study the relationship between continuous morphological variation and rates of topologically abnormal morphologies in populations of fruit flies. Some preliminary computations showing proof of concept are included. Future plans involve using theoretical contributions from this dissertation for final analysis of the fly data.
The distance modification is joint work with Ezra Miller, and the biology application is joint work with Surabhi Beriwal, Ezra Miller, and biologists at the Houle Lab at Florida State University.
Item Open Access Math 412 - Topology with Applications (2016-06-24) Ghadyali, Hamza; Bendich, Paul L
Highlights of Data Expedition:
• Students explored daily observations of local climate data spanning the past 35 years.
• Topological Data Analysis, or TDA for short, provides cutting-edge tools for studying the geometry of data in arbitrarily high dimensions.
• Using TDA tools, students discovered intrinsic dynamical features of the data and learned how to quantify periodic phenomena in a time-series.
• Since nature invariably produces noisy data which rarely has exact periodicity, students also considered the theoretical basis of almost-periodicity and even invented and tested new mathematical definitions of almost-periodic functions.
Summary
The dataset we used for this data expedition comes from the Global Historical Climatology Network. “GHCN (Global Historical Climatology Network)-Daily is an integrated database of daily climate summaries from land surface stations across the globe.” Source: https://www.ncdc.noaa.gov/oa/climate/ghcn-daily/ We focused on the daily maximum and minimum temperatures from January 1, 1980 to April 1, 2015 collected from RDU International Airport. Through a guided series of exercises designed to be performed in Matlab, students explore these time-series, initially by direct visualization and basic statistical techniques. Then students are guided through a special sliding-window construction which transforms a time-series into a high-dimensional geometric curve. These high-dimensional curves can be visualized by projecting down to lower dimensions as in the figure below (Figure 1); however, our focus here was to use persistent homology to directly study the high-dimensional embedding. The shape of these curves carries meaningful information, but how one describes the “shape” of data depends on the scale at which the data is being considered. Choosing the appropriate scale, however, is rarely obvious.
Persistent homology overcomes this obstacle by allowing us to quantitatively study geometric features of the data across multiple scales. Through this data expedition, students are introduced to numerically computing persistent homology using the Rips collapse algorithm and interpreting the results. In the specific context of sliding-window constructions, 1-dimensional persistent homology can reveal the nature of periodic structure in the original data. I created a special technique to study how these high-dimensional sliding-window curves form loops in order to quantify the periodicity. Students are guided through this construction and learn how to visualize and interpret this information. Climate data is extremely complex (as anyone who has suffered from a bad weather prediction can attest) and numerous variables play a role in determining our daily weather and temperatures. This complexity, coupled with imperfections of measuring devices, results in very noisy data. This causes the annual seasonal periodicity to be far from exact. To this end, I had students explore existing theoretical notions of almost-periodicity and test them on the data. They found that some existing definitions are inadequate in this context. Hence I challenged them to invent new mathematics by proposing and testing their own definitions. These students rose to the challenge and suggested a number of creative definitions. While autocorrelation and spectral methods based on Fourier analysis are often used to explore periodicity, the construction here provides an alternative paradigm to quantify periodic structure in almost-periodic signals using tools from topological data analysis.
Item Open Access Statistical analysis of fruit fly wing vein topology (2018-04) Beriwal, Surabhi
The fruit fly Drosophila melanogaster is a commonly used model organism for evolution given that the species showcases interesting behaviors and is easy to modify and rear.
Among other things, the Drosophila wings are studied because their structure is tractable, consistent, and traceable developmentally. Along with Dr. Ezra Miller and Ashleigh Thomas, I studied evolutionary changes to Drosophila melanogaster wings using persistent homology. The biological hypothesis posits that selecting for continuous wing deformation leads to higher rates of topological novelty. We are interested in understanding whether selection on a continuous trait can itself cause higher rates of variation of a (separate) discrete trait. This is joint work with Dr. David Houle at Florida State University.