Joint Data Modeling Using Variational Autoencoders

Kumar, Achint

Date

2022

Authors

Kumar, Achint

Advisors

Pearson, John Michael

Abstract

Nervous systems are macroscopic nonequilibrium physical systems that produce intricate behaviors that remain difficult to analyze, classify, and understand. In this thesis, I develop and analyze a statistical technique based on machine learning that attempts to improve on previous analyses of behavioral data by being multimodal, that is, by combining information from different kinds of observables to provide better insight than any single observable can. Many modern experiments record data simultaneously from multiple sources (e.g., audio, video, neural data), and it is of great interest to learn the relationships between these sources. Multimodal datasets present a challenge for latent variable models, which must capture not only the variance within each data type but also the relationships among types. Typically, this is done by training a collection of unimodal experts whose outputs are aggregated in a shared latent space. Here, building on recent developments in identifiable variational autoencoders (VAEs), I propose a new joint analysis method, the product of identifiable sufficient experts (POISE-VAE), which posits a latent representation unique to each modality, with latent spaces interacting via an undirected graphical model. This model guarantees identifiability of the latent spaces without the need for additional covariates and, given a simple yet flexible class of approximate posteriors, can be trained by maximizing an evidence lower bound approximated by Gibbs sampling. I show performance comparable to existing methods in generating realistic samples on a variety of toy and benchmark datasets, with applications to the simultaneous modeling of brain calcium imaging data and behavior. Then, I use the VAE framework to investigate the vocalizations of hearing and deaf mice during courtship.
A question of great interest is whether auditory feedback affects the vocalizations that mice produce. I use the low-dimensional representation of the data learned by the VAE to compare vocalizations from hearing and deaf mice. My statistical analysis, based on maximum mean discrepancy (MMD), finds no statistically significant difference between the vocalizations of the two groups. I conclude with a discussion of possible extensions of the model.
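
For readers unfamiliar with MMD, the following is a minimal sketch of a kernel two-sample test on low-dimensional latent vectors. It is an illustration only and not the dissertation's implementation: the abstract does not state the kernel, bandwidth, estimator, or significance procedure, so the Gaussian (RBF) kernel, the biased MMD estimate, the permutation test, and all function and parameter names (`rbf_kernel`, `mmd2_from_gram`, `mmd_permutation_test`, `bandwidth`, `n_permutations`) are assumptions made here.

```python
import math
import random


def rbf_kernel(u, v, bandwidth):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_from_gram(gram, idx_x, idx_y):
    """Biased estimate of squared MMD from a precomputed Gram matrix.

    idx_x and idx_y are the row/column indices assigned to each group.
    """
    def mean_block(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))

    return (
        mean_block(idx_x, idx_x)
        + mean_block(idx_y, idx_y)
        - 2.0 * mean_block(idx_x, idx_y)
    )


def mmd_permutation_test(x, y, bandwidth=1.0, n_permutations=200, seed=0):
    """Return (observed squared MMD, permutation p-value) for samples x and y.

    Assumed setup: x and y are lists of equal-dimension latent vectors,
    e.g. encodings of vocalizations from two groups of animals.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    # The Gram matrix is computed once; permuting group labels only
    # changes which entries are averaged.
    gram = [[rbf_kernel(u, v, bandwidth) for v in pooled] for u in pooled]
    indices = list(range(len(pooled)))
    n_x = len(x)
    observed = mmd2_from_gram(gram, indices[:n_x], indices[n_x:])

    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(indices)
        permuted = mmd2_from_gram(gram, indices[:n_x], indices[n_x:])
        if permuted >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value
```

Under this sketch, a large p-value means the observed discrepancy is typical of random relabelings of the pooled samples, which is the sense in which two groups show "no statistical difference". In practice the result depends on the kernel bandwidth, which here is a free parameter rather than a value taken from the thesis.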

Type

Dissertation

Department

Physics

Subjects

Physics, Neurosciences, Computer science, Variational autoencoder

Permalink

https://hdl.handle.net/10161/26867

Citation

Kumar, Achint (2022). Joint Data Modeling Using Variational Autoencoders. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/26867.

Collections

Dissertations

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.
