Modeling Time Series and Sequences: Learning Representations and Making Predictions

Date
2015
Author
Lian, Wenzhao
Advisor
Carin, Lawrence
Abstract

The analysis of time series and sequences has been challenging in both the statistics and machine learning communities because of properties such as high dimensionality, pattern dynamics, and irregular observations. This thesis proposes novel methods to handle these difficulties, enabling representation learning (dimension reduction and pattern extraction) and prediction (classification and forecasting). The thesis consists of three main parts.

The first part analyzes multivariate time series, which are often non-stationary due to high levels of ambient noise and various interferences. We propose a nonlinear dimensionality reduction framework using diffusion maps on a learned statistical manifold, which yields a low-dimensional representation of the high-dimensional non-stationary time series. We show that diffusion maps, with affinity kernels based on the Kullback-Leibler divergence between the local statistics of samples, allow for efficient approximation of pairwise geodesic distances. To construct the statistical manifold, we estimate time-evolving parametric distributions by designing a family of Bayesian generative models. The proposed framework can be applied to problems in which the time-evolving distributions (of temporally localized data), rather than the samples themselves, are driven by a low-dimensional underlying process. We provide efficient parameter estimation and dimensionality reduction methodology and apply it to two problems: music analysis and epileptic-seizure prediction.
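
As a rough illustration of the idea in this part, the sketch below builds a diffusion map from a symmetrized Kullback-Leibler affinity between local statistics. It is not the thesis's method: the Bayesian generative models are replaced here by plain empirical Gaussian estimates over sliding windows, and the function names (`gaussian_kl`, `local_gaussians`, `diffusion_map`), the `ridge` term, and the median heuristic for the kernel scale `eps` are all assumptions made for this example.

```python
import numpy as np

def gaussian_kl(m0, S0, m1, S1):
    """KL( N(m0, S0) || N(m1, S1) ) between two multivariate Gaussians."""
    d = m0.shape[0]
    S1_inv = np.linalg.inv(S1)
    diff = m1 - m0
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff - d + logdet1 - logdet0)

def local_gaussians(x, window, step, ridge=1e-6):
    """Empirical mean and covariance of each sliding window of x (shape T x d).

    Assumption: a stand-in for the time-evolving parametric distributions;
    the ridge term only keeps the covariance invertible.
    """
    stats = []
    for start in range(0, x.shape[0] - window + 1, step):
        seg = x[start:start + window]
        cov = np.cov(seg, rowvar=False) + ridge * np.eye(x.shape[1])
        stats.append((seg.mean(axis=0), cov))
    return stats

def diffusion_map(stats, n_components=2, eps=None):
    """Embed the windows using a kernel on symmetrized KL divergences."""
    n = len(stats)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = gaussian_kl(*stats[i], *stats[j]) + gaussian_kl(*stats[j], *stats[i])
            D[i, j] = D[j, i] = d_ij
    if eps is None:
        eps = np.median(D[D > 0])  # heuristic kernel scale (assumption)
    W = np.exp(-D / eps)
    deg = W.sum(axis=1)
    # The symmetric normalization has the same eigenvalues as the
    # row-stochastic Markov matrix P = diag(deg)^-1 W.
    A = W / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(deg)[:, None]  # right eigenvectors of P
    # Drop the trivial constant eigenvector (eigenvalue 1).
    keep = slice(1, n_components + 1)
    return vals[keep], psi[:, keep] * vals[keep]
```

Calling `diffusion_map(local_gaussians(x, window=50, step=25))` on a `T x d` array returns the leading nontrivial eigenvalues and one low-dimensional coordinate vector per window.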

The second part focuses on a time series classification task, where we want to leverage the temporal dynamic information in the classifier design. In many time series classification problems, including fraud detection, a low false alarm rate is required while the positive detection rate should remain as high as possible. We therefore directly optimize the partial area under the ROC curve (PAUC), which targets accuracy in the low false alarm rate region. Latent variables are introduced to incorporate the temporal information while keeping the max-margin formulation solvable. We propose an optimization routine and analyze its properties; the algorithm is designed to scale to web-scale data. Simulation results demonstrate the effectiveness of optimizing performance in the low false alarm rate region.
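
To make the objective concrete, here is a minimal sketch of an empirical, normalized partial AUC over the false alarm range [0, `max_fpr`]. It only evaluates the metric; it is not the latent-variable max-margin optimizer described above. The function name `partial_auc`, the tie handling, and the use of the top `floor(max_fpr * n_neg)` negatives are assumptions chosen for this example.

```python
import math
import numpy as np

def partial_auc(pos_scores, neg_scores, max_fpr):
    """Normalized empirical partial AUC for false positive rates in [0, max_fpr].

    Only the highest-scoring negatives matter in this region, so the metric is
    the fraction of (positive, top-negative) pairs that are ranked correctly.
    Ties count as half a correct pair.
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    k = max(1, math.floor(max_fpr * neg.size))  # number of negatives in the region
    top_neg = np.sort(neg)[::-1][:k]
    wins = (pos[:, None] > top_neg[None, :]).sum()
    ties = (pos[:, None] == top_neg[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * k)

pos = [0.9, 0.8, 0.3]
neg = [0.85, 0.5, 0.2, 0.1]
print(partial_auc(pos, neg, max_fpr=0.5))  # restricted to the two hardest negatives
print(partial_auc(pos, neg, max_fpr=1.0))  # reduces to the ordinary AUC
```

Restricting the comparison to the hardest negatives is what makes the score sensitive to mistakes in the low false alarm region, where a full AUC would average them away.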

The third part focuses on pattern extraction from correlated point process data, which consist of multiple correlated sequences observed at irregular times. The analysis of correlated point process data has wide applications, ranging from biomedical research to network analysis. We model such data as generated by a latent collection of continuous-time binary semi-Markov processes, corresponding to external events appearing and disappearing. A continuous-time modeling framework is more appropriate for multichannel point process data than a binning approach requiring time discretization, and we show connections between our model and recent ideas from the discrete-time literature. We describe an efficient MCMC algorithm for posterior inference, and apply our ideas to both synthetic data and a real-world biometrics application.
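
The generative picture in this part can be sketched with a small simulation: one latent binary process switches on and off in continuous time, and an observed event sequence fires at a rate that depends on the latent state. This is a simplified illustration, not the thesis's model. The Weibull holding times, the two-level Poisson rate, the single latent process, and every name and default value below are assumptions for this example.

```python
import numpy as np

def simulate_binary_semi_markov(t_end, shape=(1.5, 1.5), scale=(5.0, 2.0), rng=None):
    """Simulate a binary process on [0, t_end) that starts in state 0.

    Holding times are Weibull distributed (assumption); any non-exponential
    holding time makes the process semi-Markov rather than Markov.
    Returns the switch times and the state entered at each of them.
    """
    rng = np.random.default_rng() if rng is None else rng
    times, states = [0.0], [0]
    t, s = 0.0, 0
    while True:
        t += scale[s] * rng.weibull(shape[s])
        if t >= t_end:
            break
        s = 1 - s
        times.append(t)
        states.append(s)
    return np.array(times), np.array(states)

def simulate_events(times, states, t_end, rate_off=0.2, rate_on=3.0, rng=None):
    """Draw event times from a Poisson process whose rate follows the latent state."""
    rng = np.random.default_rng() if rng is None else rng
    edges = np.append(times, t_end)
    events = []
    for a, b, s in zip(edges[:-1], edges[1:], states):
        rate = rate_on if s == 1 else rate_off
        # Homogeneous Poisson process on [a, b): Poisson count, uniform locations.
        n = rng.poisson(rate * (b - a))
        events.append(np.sort(rng.uniform(a, b, size=n)))
    return np.concatenate(events)

rng = np.random.default_rng(0)
times, states = simulate_binary_semi_markov(50.0, rng=rng)
events = simulate_events(times, states, 50.0, rng=rng)
```

Because the event times are kept as real numbers, no bin width has to be chosen, which is the practical advantage of a continuous-time formulation over a discretized one.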

Type
Dissertation
Department
Electrical and Computer Engineering
Subject
Statistics
Computer science
Electrical engineering
point process
prediction
probabilistic models
representation learning
time series
Permalink
https://hdl.handle.net/10161/11362
Citation
Lian, Wenzhao (2015). Modeling Time Series and Sequences: Learning Representations and Making Predictions. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/11362.
Collections
  • Duke Dissertations
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
