Graphical and Isoperimetric Perspectives on Sampling and Regularization

dc.contributor.advisorDunson, David
dc.contributor.authorTam, Edric
dc.date.accessioned2025-01-08T17:44:56Z
dc.date.available2025-01-08T17:44:56Z
dc.date.issued2024
dc.departmentStatistical Science
dc.description.abstract<p>Several recurring themes in modern Bayesian statistics and probabilistic machine learning are sampling, regularization and embeddings/latent variables. This thesis examines particular aspects of these themes from the perspective of concentration of measure, isoperimetry and spectral graph theory. These mathematical topics are intricately connected, and we often leverage related tools, such as Laplacians and concentration inequalities, in the theory and methods developed in this thesis. </p><p>From the sampling methodology end, we develop a novel spanning tree sampler called the fast-forwarded cover algorithm that improves upon the celebrated Aldous-Broder algorithm. The Aldous-Broder algorithm, and other related random-walk based approaches for spanning tree sampling, can be stuck when the underlying graph exhibits bottleneck structures. We show that our fast-forwarded cover algorithm can break free of bottlenecks that are arbitrarily small, while maintaining exact sampling guarantees. We demonstrate how our novel sampler can be applied to posterior sampling from Bayesian models with tree components. </p><p>For sampling theory, we investigate the statistical capacity of broad classes of deep generative models. We use concentration of measure techniques to show that there is a limitation to what kinds of distributions deep generative models can learn to sample from. In particular, for generative adversarial networks and variational autoencoders, under standard priors on the latent variables, only light tailed distributions can be learned. This fact is generalized to the manifold setting using tools from concentration and geometry. We also provide extra quantification of these models' capacities based on functional inequalities. This debunks a popular belief about the learning capacity of deep generative models. </p><p>On the regularization side, we propose Fiedler regularization, a way to use a neural network's underlying graph structure to regularize itself. It can be used in the supervised learning setting, as well as more general settings such as for regularizing autoencoders. We give theoretial guarantees and empirical experimental results. </p><p>Last but not least, on the embeddings front, we propose a novel piece of methodology for graph comparison called embedded Laplacian distance. Typical methods for comparing graphs are computationally demanding, due to the discrete and combinatorial nature of graphs. We observe that for many applications where graph comparison is needed, an approximate comparison that captures the structures of the graphs suffice. We propose to compare graphs based on continuous representations that capture meaningful graph structure, namely spectral embeddings. Theoretical and empirical results are provided. </p>
dc.identifier.urihttps://hdl.handle.net/10161/31963
dc.rights.urihttps://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subjectStatistics
dc.titleGraphical and Isoperimetric Perspectives on Sampling and Regularization
dc.typeDissertation

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Tam_duke_0066D_18181.pdf
Size:
4.23 MB
Format:
Adobe Portable Document Format

Collections