Browsing by Subject "manifold learning"
Item Open Access A Geometric Approach to Biomedical Time Series Analysis (2020) Malik, John
Biomedical time series are non-invasive windows through which we may observe human systems. Although a vast amount of information is hidden in the medical field's growing collection of long-term, high-resolution, and multi-modal biomedical time series, effective algorithms for extracting that information have not yet been developed. We are particularly interested in the physiological dynamics of a human system, namely the changes in state that the system experiences over time (which may be intrinsic or extrinsic in origin). We introduce a mathematical model for a particular class of biomedical time series, called the wave-shape oscillatory model, which quantifies the sense in which dynamics are hidden in those time series. There are two key ideas behind the new model. First, instead of viewing a biomedical time series as a sequence of measurements made at the sampling rate of the signal, we can often view it as a sequence of cycles occurring at irregularly sampled time points. Second, the "shape" of an individual cycle is assumed to have a one-to-one correspondence with the state of the system being monitored; as such, changes in system state (dynamics) can be inferred by tracking changes in cycle shape. Since physiological dynamics are not random but are well-regulated (except in the most pathological of cases), we can assume that all of the system's states lie on a low-dimensional, abstract Riemannian manifold called the phase manifold. When we model the correspondence between the hidden system states and the observed cycle shapes using a diffeomorphism, we allow the topology of the phase manifold to be recovered by methods belonging to the field of unsupervised manifold learning.
In particular, we prove that the physiological dynamics hidden in a time series adhering to the wave-shape oscillatory model can be well-recovered by applying the diffusion maps algorithm to the time series' set of oscillatory cycles. We provide several applications of the wave-shape oscillatory model and the associated algorithm for dynamics recovery, including unsupervised and supervised heartbeat classification, derived respiratory monitoring, intra-operative cardiovascular monitoring, supervised and unsupervised sleep stage classification, and f-wave extraction (a single-channel blind source separation problem).
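As a rough illustration of the idea of applying diffusion maps to a set of oscillatory cycles, the following is a minimal sketch and not the thesis's implementation. The synthetic cycles (Gaussian bumps whose width depends on a hidden state), the median bandwidth heuristic, the density normalization, and all names such as `diffusion_map` and `hidden_state` are assumptions made for this example.

```python
import numpy as np

def diffusion_map(cycles, n_components=2, epsilon=None, t=1):
    """Embed the rows of `cycles` (one cycle per row) with a basic diffusion map."""
    X = np.asarray(cycles, dtype=float)
    # Pairwise squared Euclidean distances between cycles.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if epsilon is None:
        # Assumption: median heuristic for the kernel bandwidth.
        epsilon = np.median(d2[d2 > 0])
    K = np.exp(-d2 / epsilon)
    # Density normalization (alpha = 1) to reduce the effect of uneven sampling.
    q = K.sum(axis=1)
    K = K / np.outer(q, q)
    # Symmetric matrix that shares eigenvalues with the Markov matrix D^{-1} K.
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    # Convert to right eigenvectors of the Markov matrix.
    psi = vecs / np.sqrt(d)[:, None]
    # Skip the trivial constant eigenvector and scale by eigenvalue^t.
    keep = slice(1, n_components + 1)
    return psi[:, keep] * vals[keep] ** t

# Hypothetical data: each cycle is a bump whose width encodes a hidden state.
rng = np.random.default_rng(0)
hidden_state = np.sort(rng.uniform(0.0, 1.0, size=200))
u = np.linspace(0.0, 1.0, 100)
widths = 0.05 + 0.1 * hidden_state
cycles = np.exp(-((u[None, :] - 0.5) ** 2) / (2.0 * widths[:, None] ** 2))

embedding = diffusion_map(cycles, n_components=2)
```

In this toy setting, the first embedding coordinate should vary monotonically with the hidden state, which is the sense in which changes in cycle shape track changes in system state. Real recordings would first need cycle detection and alignment, which this sketch does not cover.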
Item Open Access Bayesian Computation for High-Dimensional Continuous & Sparse Count Data (2018) Wang, Ye
Probabilistic modeling of multidimensional data is a common problem in practice. When the data is continuous, one common approach is to suppose that the observed data are close to a lower-dimensional smooth manifold. There is a rich variety of manifold learning methods available, which allow mapping of data points to the manifold. However, there is a clear lack of probabilistic methods that allow learning of the manifold along with the generative distribution of the observed data. The best attempt is the Gaussian process latent variable model (GP-LVM), but identifiability issues lead to poor performance. We solve these issues by proposing a novel Coulomb repulsive process (Corp) for locations of points on the manifold, inspired by physical models of electrostatic interactions among particles. Combining this process with a GP prior for the mapping function yields a novel electrostatic GP (electroGP) process.
Another popular approach is to suppose that the observed data are close to one or a union of lower-dimensional linear subspaces. However, popular methods such as probabilistic principal component analysis scale poorly computationally. We introduce a novel empirical Bayesian method that we term geometric density estimation (GEODE), which assumes the data is centered near a low-dimensional linear subspace. We show that, with mild assumptions on the prior, the subspace spanned by the principal axes of the data maximizes the posterior mode. Hence, by leveraging the geometric information of the data, GEODE easily scales to massive dimensional problems. It is also capable of learning the intrinsic dimension via a novel shrinkage prior. Finally, we mix GEODE across a dyadic clustering tree to account for nonlinear cases.
When data is discrete, a common strategy is to define a generalized linear model (GLM) for each variable, with dependence among the different variables induced by including multivariate latent variables in the GLMs. Bayesian inference for these models usually relies on the data-augmented Markov chain Monte Carlo (DA-MCMC) method, which has a provably slow mixing rate when the data is imbalanced. For more scalable inference, we propose Bayesian mosaic, a parallelizable composite posterior, for a subclass of the multivariate discrete data models. Sampling is embarrassingly parallel since Bayesian mosaic is a product of component posteriors that can be sampled from independently. Analogous to composite likelihood methods, these component posteriors are based on univariate or bivariate marginal densities. Utilizing the fact that the score functions of these densities are unbiased, we show that Bayesian mosaic is consistent and asymptotically normal under mild conditions. Since the univariate or bivariate marginal densities can be evaluated via numerical integration, sampling from Bayesian mosaic completely bypasses the traditional DA-MCMC method. Moreover, we show that sampling from Bayesian mosaic scales better to large sample sizes than DA-MCMC.
The performance of the proposed methods and models will be demonstrated via both simulation studies and real-world applications.
Item Open Access Learning gradients on manifolds (BERNOULLI, 2010-02) Mukherjee, S; Wu, Q; Zhou, DX