Multiscale dictionary learning for estimating conditional distributions
Nonparametric estimation of the conditional distribution of a response given high-dimensional features is a challenging problem. It is important to allow not only the mean but also the variance and shape of the response density to change flexibly with the features, which may be massive-dimensional. We propose a multiscale dictionary learning model, which expresses the conditional response density as a convex combination of dictionary densities, with the densities used and their weights dependent on the path through a tree decomposition of the feature space. A fast graph partitioning algorithm is applied to obtain the tree decomposition, with Bayesian methods then used to adaptively prune and average over different sub-trees in a soft probabilistic manner. The algorithm scales efficiently to approximately one million features. State-of-the-art predictive performance is demonstrated for toy examples and two neuroscience applications involving up to a million features.
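The central idea of the abstract, a conditional density built as a convex combination of dictionary densities selected by a path through a tree, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-level tree over a scalar feature, the Gaussian dictionary densities, the names `DICTIONARY`, `STOP_PROB`, `path_for`, and the stick-breaking form of the weights. The graph partitioning step and the Bayesian pruning and averaging are not shown.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a Normal(mean, sd^2) distribution evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical dictionary: one Gaussian (mean, sd) per tree node.
DICTIONARY = {
    "root": (0.0, 2.0),    # coarse scale: the whole feature space
    "left": (-1.0, 1.0),   # fine scale: x < 0.5
    "right": (1.5, 0.5),   # fine scale: x >= 0.5
}

# Hypothetical stick-breaking proportions: the share of the remaining
# weight assigned to each internal node. The last node on a path
# receives whatever weight is left, so the weights always sum to one.
STOP_PROB = {"root": 0.3}


def path_for(x):
    """Return the root-to-leaf path of node names containing feature x."""
    return ["root", "left" if x < 0.5 else "right"]


def path_weights(path):
    """Convex weights along a path, built by stick-breaking."""
    weights = []
    remaining = 1.0
    for node in path[:-1]:
        w = remaining * STOP_PROB[node]
        weights.append(w)
        remaining -= w
    weights.append(remaining)
    return weights


def conditional_density(y, x):
    """Evaluate f(y | x) as a weighted sum of densities along the path of x."""
    path = path_for(x)
    weights = path_weights(path)
    return sum(
        w * normal_pdf(y, *DICTIONARY[node])
        for w, node in zip(weights, path)
    )
```

Because both the dictionary densities and the weights depend on the path, the mean, variance, and shape of the response density can all differ between `x = 0.2` and `x = 0.8` in this toy setup, which is the flexibility the abstract describes.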
Arts and Sciences Distinguished Professor of Statistical Science
Development of novel approaches for representing and analyzing complex data. A particular focus is on methods that incorporate geometric structure (both known and unknown) and on probabilistic approaches to characterize uncertainty. In addition, a major interest is in scalable algorithms and in developing approaches with provable guarantees. This fundamental work is directly motivated by applications in biomedical research, network data analysis, neuroscience, genomics, ecol