Joint Data Modeling Using Variational Autoencoders

dc.contributor.advisor

Pearson, John Michael

dc.contributor.author

Kumar, Achint

dc.date.accessioned

2023-03-28T21:43:07Z

dc.date.available

2023-03-28T21:43:07Z

dc.date.issued

2022

dc.department

Physics

dc.description.abstract

Nervous systems are macroscopic nonequilibrium physical systems that produce intricate behaviors that remain difficult to analyze, classify, and understand. In my thesis, I develop and analyze a statistical technique based on machine learning that attempts to improve upon previous efforts to analyze behavioral data by being multimodal, that is, by combining information from different kinds of observables to provide better insight than any one observable can provide alone. Many modern experiments simultaneously record data from multiple sources (e.g., audio, video, neural data), and it is of great interest to learn the relationships between these data sources. Multimodal datasets present a challenge for latent variable models, as they must learn to capture not only the variance present within each data type but also the relationships among types. Typically, this is done by training a collection of unimodal experts, the outputs of which are aggregated in a shared latent space. Here, building on recent developments in identifiable variational autoencoders (VAEs), I propose a new joint analysis method, the product of identifiable sufficient experts (POISE-VAE), which posits a latent representation unique to each modality, with latent spaces interacting via an undirected graphical model. This model guarantees identifiability of the latent spaces without the need for additional covariates and, given a simple yet flexible class of approximate posteriors, can be trained by maximizing an evidence lower bound approximated by Gibbs sampling. I show performance comparable to existing methods on a variety of toy and benchmark datasets in generating realistic samples, with applications to the simultaneous modeling of brain calcium imaging data and behavior. Then, I use the VAE framework to investigate the vocalizations of hearing and deaf mice during courtship.
It is of great interest to determine whether auditory feedback affects the vocalizations produced by hearing and deaf mice. I use the low-dimensional representation of the data learned by the VAE to compare the vocalizations produced in the two cases. My statistical analysis, based on maximum mean discrepancy (MMD), finds no statistically significant difference between the vocalizations produced by the two groups. I conclude with a discussion of possible extensions of the model.
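The MMD-based comparison described in the abstract can be sketched as a kernel two-sample test: compute an unbiased estimate of the squared MMD between the two groups' latent representations, then obtain a p-value by permuting group labels. This is a minimal illustration, not the thesis's implementation; the RBF kernel, bandwidth `sigma`, and permutation count are assumed illustrative choices.

```python
import numpy as np

def mmd2_unbiased(X, Y, sigma=1.0):
    """Unbiased estimate of squared MMD between samples X and Y (RBF kernel)."""
    def rbf(A, B):
        # Pairwise squared distances, then Gaussian kernel.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    Kxx, Kyy, Kxy = rbf(X, X), rbf(Y, Y), rbf(X, Y)
    n, m = len(X), len(Y)
    # Drop diagonal terms so the within-sample averages are unbiased.
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * Kxy.mean()

def permutation_test(X, Y, n_perm=500, seed=0):
    """p-value for H0: X and Y are drawn from the same distribution."""
    rng = np.random.default_rng(seed)
    observed = mmd2_unbiased(X, Y)
    pooled = np.vstack([X, Y])
    n = len(X)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        count += mmd2_unbiased(pooled[idx[:n]], pooled[idx[n:]]) >= observed
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)
```

A large p-value from `permutation_test` corresponds to the abstract's conclusion of no statistically significant difference between the hearing and deaf groups' vocalization embeddings.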

dc.identifier.uri

https://hdl.handle.net/10161/26867

dc.subject

Physics

dc.subject

Neurosciences

dc.subject

Computer science

dc.subject

Variational autoencoder

dc.title

Joint Data Modeling Using Variational Autoencoders

dc.type

Dissertation

Files

Original bundle

Name:
Kumar_duke_0066D_17078.pdf
Size:
7.82 MB
Format:
Adobe Portable Document Format