Durability Queries on Temporal Data

dc.contributor.advisor

Yang, Jun

dc.contributor.author

Gao, Junyang

dc.date.accessioned

2020-09-18T16:00:16Z

dc.date.available

2020-09-18T16:00:16Z

dc.date.issued

2020

dc.department

Computer Science

dc.description.abstract

Temporal data is ubiquitous in our everyday life, but tends to be noisy and often exhibits transient patterns. To make better decisions with data, we must avoid jumping to conclusions based on a few particular query results or observations. Instead, a useful perspective is to consider "durability", or, intuitively speaking, finding results that are robust and stand "the test of time". This thesis studies durability queries on temporal data that return durable results efficiently and effectively.

The focus of this thesis is two-fold: (1) design meaningful and practical notions of durability (and corresponding queries) on different types of temporal data, and (2) develop efficient techniques for durability query processing.

We first study sequence-based temporal datasets where each temporal object has a series of values indexed by time.

Durability queries ask for objects whose (snapshot) values were among the top $k$ for at least some fraction of the times during a given time interval; e.g., "from 2013 to 2016, United Airlines had the highest stock price among American-based airline companies for at least 80% of the time."
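As a rough illustration of the sequence-based query described above, the following brute-force sketch counts, for each object, the time steps at which it ranks in the top $k$, and keeps those that reach the required fraction. The function name `durable_top_k`, the parameter `tau`, and the sample data are hypothetical and are not taken from the thesis, which develops indexing methods far more efficient than this linear scan.

```python
from typing import Dict, List


def durable_top_k(series: Dict[str, List[float]], k: int,
                  start: int, end: int, tau: float) -> List[str]:
    """Return objects ranked in the top k for at least a fraction tau
    of the time steps in the inclusive interval [start, end]."""
    n_steps = end - start + 1
    counts = {name: 0 for name in series}
    for t in range(start, end + 1):
        # Rank all objects by their snapshot value at time t, highest first.
        ranked = sorted(series, key=lambda name: series[name][t], reverse=True)
        for name in ranked[:k]:
            counts[name] += 1
    return sorted(name for name, c in counts.items() if c >= tau * n_steps)


# Hypothetical toy data: three objects observed at five time steps.
prices = {
    "A": [5.0, 6.0, 7.0, 3.0, 8.0],
    "B": [4.0, 7.0, 6.0, 9.0, 2.0],
    "C": [1.0, 1.0, 1.0, 1.0, 1.0],
}
```

With this data, `durable_top_k(prices, 1, 0, 4, 0.5)` keeps only "A", which holds the top value at three of the five time steps.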

Second, we consider instant-stamped temporal datasets where each data record is stamped by a time instant.

Here, durability queries look for records that stand out among nearby records (defined by a time window) and retain their supremacy for a long period of time; e.g., "On January 22, 2006, Kobe Bryant dropped 81 points against the Toronto Raptors, a scoring record that since then has yet to be broken."
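One simplified reading of the instant-stamped setting can be sketched as follows: for each record, measure how long it remains unbeaten by any later record, and report those that last at least a minimum duration. This forward-looking, top-1 interpretation, along with the names `unbroken_duration`, `durable_records`, and `now`, is an assumption made for illustration and may differ from the formal definition used in the thesis.

```python
from typing import List, Tuple

Record = Tuple[int, float]  # (timestamp, value)


def unbroken_duration(records: List[Record], now: int) -> List[Tuple[Record, int]]:
    """For each record, compute how long it stayed unbeaten by later
    records, measured up to the current time `now`."""
    ordered = sorted(records)  # sort by timestamp
    result = []
    for i, (t, v) in enumerate(ordered):
        end = now
        for t2, v2 in ordered[i + 1:]:
            if v2 > v:  # first later record with a strictly higher value
                end = t2
                break
        result.append(((t, v), end - t))
    return result


def durable_records(records: List[Record], now: int,
                    min_duration: int) -> List[Record]:
    """Keep records that stayed unbeaten for at least min_duration."""
    return [rec for rec, d in unbroken_duration(records, now) if d >= min_duration]


# Hypothetical toy data: (timestamp, points scored).
games = [(1, 50.0), (3, 81.0), (4, 60.0), (9, 62.0)]
```

Here `durable_records(games, now=10, min_duration=6)` keeps only the record `(3, 81.0)`, which no later record surpasses.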

Finally, going beyond analyzing historical data, we investigate the notion of durability into the future, where durability needs to be predicted by performing stochastic simulation of temporal models.
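To make the idea of predicting durability by simulation concrete, the sketch below estimates, by plain Monte Carlo sampling, the probability that a value stays at or above a threshold for an entire future horizon. The Gaussian random-walk model and all names (`estimate_durability`, `step_std`, `n_sims`) are assumptions chosen for illustration; the thesis does not prescribe this model, and its techniques aim to be more efficient than naive sampling.

```python
import random


def estimate_durability(start: float, threshold: float, horizon: int,
                        n_sims: int = 10000, step_std: float = 1.0,
                        seed: int = 0) -> float:
    """Estimate the probability that a Gaussian random walk starting at
    `start` stays at or above `threshold` for `horizon` future steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        x = start
        survived = True
        for _ in range(horizon):
            x += rng.gauss(0.0, step_std)  # one simulated time step
            if x < threshold:
                survived = False
                break
        hits += survived
    return hits / n_sims
```

The returned fraction is a sample estimate, so its accuracy depends on `n_sims`; rare durable outcomes would need many more simulations or a smarter sampling scheme.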

For answering durability queries across these problem settings, we apply principled approaches to design fast, scalable algorithms and indexing methods.

Our solutions broadly combine geometric, statistical, and approximate query processing techniques to provide a meaningful balance between query efficiency and result quality, along with theoretical worst-case (or average-case) guarantees.

dc.identifier.uri

https://hdl.handle.net/10161/21471

dc.subject

Computer science

dc.subject

Approximate Query Processing

dc.subject

Durability Queries

dc.subject

Probabilistic Data

dc.subject

Temporal Data

dc.title

Durability Queries on Temporal Data

dc.type

Dissertation

Files

Original bundle

Name:
Gao_duke_0066D_15827.pdf
Size:
3.14 MB
Format:
Adobe Portable Document Format
