Modeling Time Series and Sequences: Learning Representations and Making Predictions
The analysis of time series and sequences has long been challenging for both the statistics and machine learning communities because of properties such as high dimensionality, pattern dynamics, and irregular observations. This thesis proposes novel methods to handle these difficulties, enabling representation learning (dimension reduction and pattern extraction) and prediction (classification and forecasting). The thesis consists of three main parts.
The first part analyzes multivariate time series, which are often non-stationary due to high levels of ambient noise and various interferences. We propose a nonlinear dimensionality reduction framework using diffusion maps on a learned statistical manifold, which yields a low-dimensional representation of the high-dimensional non-stationary time series. We show that diffusion maps, with affinity kernels based on the Kullback-Leibler divergence between the local statistics of samples, allow for efficient approximation of pairwise geodesic distances. To construct the statistical manifold, we estimate time-evolving parametric distributions by designing a family of Bayesian generative models. The proposed framework can be applied to problems in which the time-evolving distributions (of temporally localized data), rather than the samples themselves, are driven by a low-dimensional underlying process. We provide efficient parameter estimation and dimensionality reduction methodology and apply it to two problems: music analysis and epileptic-seizure prediction.
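As a rough illustration of the pipeline described above, the following Python sketch builds a diffusion-map embedding from a symmetrized Kullback-Leibler divergence between local Gaussian statistics. It is not the thesis's implementation: the Bayesian generative models are replaced here by plain empirical means and covariances over non-overlapping windows, and the names `local_gaussians`, `sym_kl`, `pairwise_sym_kl`, `diffusion_map`, the window length `win`, and the kernel scale `eps` are illustrative assumptions.

```python
import numpy as np

def local_gaussians(x, win):
    """Empirical mean/covariance per non-overlapping window.

    Assumption: a simple stand-in for the time-evolving parametric
    distributions that the thesis estimates with Bayesian models.
    """
    n, d = x.shape[0] // win, x.shape[1]
    means = np.empty((n, d))
    covs = np.empty((n, d, d))
    for i in range(n):
        seg = x[i * win:(i + 1) * win]
        means[i] = seg.mean(axis=0)
        # Small ridge keeps the covariance invertible.
        covs[i] = np.atleast_2d(np.cov(seg, rowvar=False)) + 1e-6 * np.eye(d)
    return means, covs

def sym_kl(m1, c1, m2, c2):
    """KL(p||q) + KL(q||p) for two Gaussians (log-determinants cancel)."""
    d = m1.size
    i1, i2 = np.linalg.inv(c1), np.linalg.inv(c2)
    dm = m1 - m2
    return 0.5 * (np.trace(i2 @ c1) + np.trace(i1 @ c2) - 2 * d
                  + dm @ (i1 + i2) @ dm)

def pairwise_sym_kl(means, covs):
    """Matrix of symmetrized KL divergences between all windows."""
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = sym_kl(means[i], covs[i],
                                             means[j], covs[j])
    return dist

def diffusion_map(dist, eps, n_coords=2):
    """Diffusion-map coordinates from a divergence matrix."""
    k = np.exp(-dist / eps)                    # affinity kernel
    deg = k.sum(axis=1)
    s = k / np.sqrt(np.outer(deg, deg))        # symmetric conjugate of D^-1 K
    vals, vecs = np.linalg.eigh(s)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(deg)[:, None]         # right eigenvectors of D^-1 K
    # Skip the trivial constant eigenvector (eigenvalue 1).
    return psi[:, 1:n_coords + 1] * vals[1:n_coords + 1]
```

A typical call would compute `means, covs = local_gaussians(x, win)`, then `dist = pairwise_sym_kl(means, covs)`, then `diffusion_map(dist, eps=np.median(dist))`. The median heuristic for `eps` is a common default, not a choice taken from the thesis.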
The second part focuses on a time series classification task, where we want to leverage temporal dynamic information in the classifier design. In many time series classification problems, including fraud detection, a low false alarm rate is required while the positive detection rate should remain as high as possible. We therefore directly optimize the partial area under the ROC curve (PAUC), which targets accuracy in the low false-alarm-rate region. Latent variables are introduced to incorporate the temporal information while keeping the max-margin based formulation solvable. We propose an optimization routine and analyze its properties; the algorithm is designed to scale to web-scale data. Simulation results demonstrate the effectiveness of optimizing performance in the low false-alarm-rate region.
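To make the objective concrete, the sketch below computes an empirical partial AUC over the false-positive-rate range [0, beta]: the fraction of (positive, negative) pairs ranked correctly when only the highest-scoring negatives are considered. It shows the evaluation quantity only, not the latent-variable max-margin optimization proposed in the thesis; the function name `partial_auc`, the parameter `beta`, and the tie-handling convention are assumptions made for illustration.

```python
import numpy as np

def partial_auc(scores, labels, beta):
    """Normalized empirical partial AUC for FPR in [0, beta].

    Only the ceil(beta * n_neg) highest-scoring negatives are used,
    since these are the ones that cause false alarms at low FPR.
    Ties count as half a correct pair (an assumed convention).
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    pos = scores[labels]
    neg = np.sort(scores[~labels])[::-1]       # negatives, highest score first
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative")
    k = int(np.ceil(beta * neg.size))
    top_neg = neg[:k]
    wins = (pos[:, None] > top_neg[None, :]).sum()
    ties = (pos[:, None] == top_neg[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * k)
```

With `beta=1.0` this reduces to the ordinary AUC; smaller values of `beta` score the classifier only against the negatives most likely to trigger false alarms.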
The third part focuses on pattern extraction from correlated point process data, which consist of multiple correlated sequences observed at irregular times. The analysis of correlated point process data has wide applications, ranging from biomedical research to network analysis. We model such data as generated by a latent collection of continuous-time binary semi-Markov processes, corresponding to external events appearing and disappearing. A continuous-time modeling framework is more appropriate for multichannel point process data than a binning approach requiring time discretization, and we show connections between our model and recent ideas from the discrete-time literature. We describe an efficient MCMC algorithm for posterior inference, and apply our ideas to both synthetic data and a real-world biometrics application.
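The generative idea above can be sketched by simulating a single latent on/off process in continuous time together with one observed event channel. This is an illustration under stated assumptions rather than the thesis's model: Gamma-distributed sojourn times, a piecewise-constant Poisson event rate of `base_rate + boost * state`, and all function and parameter names are hypothetical, and the MCMC inference step is not covered.

```python
import numpy as np

def simulate_binary_semi_markov(t_end, shape_off, scale_off,
                                shape_on, scale_on, rng):
    """Switch times and states of an on/off process on [0, t_end).

    Sojourn times are Gamma distributed (an assumption); non-exponential
    sojourns are what make the process semi-Markov rather than Markov.
    """
    times, states = [0.0], [0]                 # start in the "off" state
    t, s = 0.0, 0
    while True:
        shape, scale = (shape_on, scale_on) if s else (shape_off, scale_off)
        t += rng.gamma(shape, scale)
        if t >= t_end:
            break
        s = 1 - s
        times.append(t)
        states.append(s)
    return np.array(times), np.array(states)

def simulate_events(times, states, t_end, base_rate, boost, rng):
    """Irregular event times with rate base_rate + boost * state."""
    edges = np.append(times, t_end)
    events = []
    for a, b, s in zip(edges[:-1], edges[1:], states):
        rate = base_rate + boost * s
        n = rng.poisson(rate * (b - a))        # event count on [a, b)
        events.append(rng.uniform(a, b, size=n))
    return np.sort(np.concatenate(events))
```

Several correlated channels could be obtained by calling `simulate_events` repeatedly with the same `times` and `states` but different `base_rate` and `boost` values, so that all channels respond to the same latent process without any time binning.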
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations