Browsing by Subject "Transfer learning"
Item Embargo Adaptive Planning in Changing Policies and Environments (2023) Sivakumar, Kavinayan Pillaiar
Being able to adapt to different tasks is a staple of learning, as agents aim to generalize across different situations. Specifically, it is important for agents to adapt to the policies of other agents around them. In swarm settings, multi-agent sports settings, or other team-based environments, agents learning from one another can save time and reduce errors in performance. As a result, traditional transfer reinforcement learning proposes ways to decrease the time it takes for an agent to learn from an expert agent. However, the problem of transferring knowledge across agents that operate in different action spaces and are therefore heterogeneous poses new challenges. Mainly, it is difficult to translate between heterogeneous agents whose action spaces are not guaranteed to intersect.
We propose a transfer reinforcement learning algorithm between heterogeneous agents based on a subgoal trajectory mapping algorithm. We learn a mapping between expert and learner trajectories that are expressed through subgoals. We do so by training a recurrent neural network on trajectories in a training set. Then, given a new task, we input the expert's trajectory of subgoals into the trained model to predict the optimal trajectory of subgoals for the learner agent. We show that the learner agent is able to learn an optimal policy faster with this predicted trajectory of subgoals.
It is equally important for agents to adapt to the intentions of agents around them. To this end, we propose an inverse reinforcement learning algorithm to estimate the reward function of an agent as it updates its policy over time. Previous work in this field assumes the reward function is approximated by a set of linear feature functions. Choosing an expressive enough set of feature functions can be challenging, and failure to do so can skew the learned reward function. Instead, we propose an algorithm to estimate the policy parameters of the agent as it learns, bundling adjacent trajectories together in a new form of behavior cloning we call bundle behavior cloning. Our complexity analysis shows that using bundle behavior cloning, we can attain a tighter bound on the difference between the distribution of the cloned policy and that of the true policy than the same bound achieved in standard behavior cloning. We show experiments where our method achieves the same overall reward using the estimated reward function as that learned from the initial trajectories. We also test the feasibility of bundle behavior cloning with different neural network structures and empirically test the effect of the bundle choice on performance.
Finally, due to the need for agents to adapt to environments that are prone to change due to damage or detection, we propose the design of a robotic sensing agent to detect damage. In such dangerous environments, it may be unsafe for human operators to manually take measurements. Current literature in structural health monitoring proposes sequential sensing algorithms to optimize the number of locations measurements need to be taken at before locating sources of damage. As a result, the robotic sensing agent we designed is mobile, semi-autonomous, and precise in measuring a location on the model structure we built. We detail the components of our robotic sensing agent, as well as show measurement data taken from our agent at two locations on the structure displaying little to no noise in the measurement.
Item Open Access Advances in Log-concave Sampling and Domain Adaptation (2023) Wu, Keru
In the vast field of machine learning, the study of distributions and the innovation of algorithms deriving from them serve as foundational pursuits. This dissertation delves into two critical topics tied to distributions: log-concave sampling and domain adaptation. Our research encompasses both theoretical and methodological developments in these fields, supported by thorough numerical experiments.
Beginning with log-concave sampling, we establish the minimax mixing time of the Metropolis-adjusted Langevin algorithm (MALA) for sampling from a log-smooth and strongly log-concave distribution. Our proof is divided into two main components: an upper bound and a lower bound. First, for a $d$-dimensional log-concave density with condition number $\kappa$, we show that MALA with a warm start mixes in $\tilde O(\kappa\sqrt{d})$ iterations up to logarithmic factors. Second, we prove a spectral gap based mixing time lower bound for reversible MCMC algorithms on general state spaces. We apply this lower bound result to construct a hard distribution for which MALA requires at least $\tilde\Omega(\kappa\sqrt{d})$ steps to mix. The lower bound aligns with our upper bound in terms of condition number and dimension, thereby demonstrating the minimax mixing time of MALA for log-concave sampling.
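For readers unfamiliar with MALA, the iteration whose mixing time is analyzed above can be sketched as follows. This is the generic textbook form of the algorithm (a Langevin proposal followed by a Metropolis-Hastings correction), not code from the dissertation. The function name `mala`, the diagonal Gaussian target with condition number 4, the step size, and the iteration count are all illustrative assumptions.

```python
import numpy as np

def mala(log_p, grad_log_p, x0, step, n_iters, rng):
    """Generic MALA sampler (illustrative sketch, not the dissertation's code).

    log_p / grad_log_p: log-density (up to a constant) and its gradient.
    step: Langevin step size h; the proposal is N(x + h * grad, 2h * I).
    Returns the chain and the empirical acceptance rate.
    """
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_iters, x.size))
    accepted = 0
    for t in range(n_iters):
        # Langevin proposal centered at a gradient step from x.
        mean_fwd = x + step * grad_log_p(x)
        y = mean_fwd + np.sqrt(2.0 * step) * rng.standard_normal(x.size)
        # Center of the reverse proposal, needed for the MH ratio.
        mean_bwd = y + step * grad_log_p(y)
        log_q_fwd = -np.sum((y - mean_fwd) ** 2) / (4.0 * step)
        log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (4.0 * step)
        log_alpha = log_p(y) - log_p(x) + log_q_bwd - log_q_fwd
        # Metropolis-Hastings accept/reject step.
        if np.log(rng.uniform()) < log_alpha:
            x = y
            accepted += 1
        samples[t] = x
    return samples, accepted / n_iters

# Assumed toy target: zero-mean Gaussian with diagonal precision,
# eigenvalues between 1 and 4 (so the condition number is 4), d = 10.
precision = np.linspace(1.0, 4.0, 10)
chain, acc_rate = mala(
    log_p=lambda x: -0.5 * np.sum(precision * x ** 2),
    grad_log_p=lambda x: -precision * x,
    x0=np.zeros(10),
    step=0.1,
    n_iters=5000,
    rng=np.random.default_rng(0),
)
print("acceptance rate:", acc_rate)
```

The step size here is hand-picked for the toy target; the bounds quoted in the abstract concern how the number of iterations must scale with $\kappa$ and $d$, which this sketch does not attempt to reproduce.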
Shifting focus to Domain Adaptation (DA), we tackle the statistical learning problem where the distribution of the source data used to train a model differs from that of the target data used to evaluate the model. Our work is grounded in the assumption of the presence of conditionally invariant components (CICs): feature representations that are relevant for prediction and remain conditionally invariant across the source and target distributions. We demonstrate three prominent roles of CICs in providing target risk guarantees in DA. First, we propose a new algorithm based on CICs, importance-weighted conditional invariant penalty (IW-CIP), which has target risk guarantees beyond simple settings such as covariate shift and label shift. Second, we show that CICs help identify large discrepancies between source and target risks of other DA algorithms. Finally, we demonstrate that incorporating CICs into the domain invariant projection (DIP) algorithm can address its failure scenario caused by label-flipping features.
Through rigorous analysis, we advance insights into log-concave sampling and domain adaptation. Our exploration underscores the importance of probabilistic understanding in designing algorithms tailored to intricate data distributions encountered in machine learning.
Item Open Access Bayesian Learning with Dependency Structures via Latent Factors, Mixtures, and Copulas (2016) Han, Shaobo
Bayesian methods offer a flexible and convenient probabilistic learning framework to extract interpretable knowledge from complex and structured data. Such methods can characterize dependencies among multiple levels of hidden variables and share statistical strength across heterogeneous sources. In the first part of this dissertation, we develop two dependent variational inference methods for full posterior approximation in non-conjugate Bayesian models through hierarchical mixture- and copula-based variational proposals, respectively. The proposed methods move beyond the widely used factorized approximation to the posterior and provide generic applicability to a broad class of probabilistic models with minimal model-specific derivations. In the second part of this dissertation, we design probabilistic graphical models to accommodate multimodal data, describe dynamical behaviors and account for task heterogeneity. In particular, the sparse latent factor model is able to reveal common low-dimensional structures from high-dimensional data. We demonstrate the effectiveness of the proposed statistical learning methods on both synthetic and real-world data.
Item Open Access Building a patient-specific model using transfer learning for 4D-CBCT augmentation (2020) Sun, Leshan
Purpose: Four-dimensional cone beam computed tomography (4D-CBCT) has been developed to provide respiratory phase-resolved volumetric images in aid of image guided radiation therapy (IGRT), especially in SBRT, which requires highly accurate dose delivery. However, 4D-CBCT suffers from insufficient projection data in each phase bin, which leads to severe noise and artifacts. To address this problem, deep learning methods have been introduced to help augment image quality. However, when traditional deep learning methods are used to augment CBCT images, the augmented images tend to lose small details such as lung textures. In this study, a transfer learning method is proposed to further improve the image quality of the deep-learning augmented CBCT for one specific patient.
Methods: The network architecture used in this project for transfer learning is a standard U-net. CBCT images were reconstructed using limited projections that were either simulated from ground truth CT images or taken directly from the clinic. In the transfer learning training process, the network was first fed with different patients’ data in order to learn a general restoration process to augment under-sampled CBCT images from any patient. Then, the restoration pattern was improved for one specific patient by re-feeding the network with this patient’s data from prior days. Performance of transfer learning was evaluated by comparing the augmented CBCT images to the traditional deep learning method’s images both qualitatively and quantitatively, using the structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR).
To study the effectiveness and time efficiency of transfer learning, two transfer learning methods, whole-layer fine-tuning and layer-freezing, were compared to each other. Two training methods, whole-data tuning and sequential tuning, were employed as well to further explore the possibility of improving transfer learning’s performance and reducing training time.
Results: The comparison demonstrated that images augmented by the transfer learning method not only recovered more detailed information in the lung area but also had more uniform pixel values than basic U-net images when compared to the ground truth. In addition, the two transfer learning methods, whole-layer fine-tuning and layer-freezing, and the two training methods, sequential training and all-data training, were compared to each other; all-data training with the layer-freezing method was found to be time-efficient, with training times as short as 10 minutes. In the study of the effect of projection number, transfer-learning augmented CBCT images reconstructed from as few as 90 out of 900 projections showed improvement over U-net augmented images.
Conclusion: Overall, the transfer learning based image augmentation method is efficient and effective at improving the image quality of under-sampled 3D/4D-CBCT images augmented by traditional deep-learning methods. Given its relatively fast computational speed and strong performance, it can be very valuable for 4D image guided radiation therapy.
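As a point of reference for the quantitative evaluation named in the Methods, the peak signal-to-noise ratio between a ground-truth image and an augmented image can be computed as in the minimal sketch below. The function name `psnr` and the choice to default the data range to the span of the reference image are assumptions made for illustration; the abstract does not state which convention the study used.

```python
import numpy as np

def psnr(reference, test, data_range=None):
    """Peak signal-to-noise ratio in dB (illustrative sketch).

    data_range: maximum possible intensity span. If omitted, it is taken
    from the reference image (an assumption, not the study's stated choice).
    """
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    if reference.shape != test.shape:
        raise ValueError("images must have the same shape")
    if data_range is None:
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy usage: a 2x2 "image" and a copy offset by a constant error of 0.1.
ground_truth = np.array([[0.0, 1.0], [1.0, 0.0]])
augmented = ground_truth + 0.1
print(psnr(ground_truth, augmented))
```

A higher PSNR means the augmented image is closer to the ground truth in the mean-squared sense; SSIM complements it by comparing local structure rather than per-pixel error.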
Item Open Access Learning to Transfer Knowledge from Multiple Sources of Electrophysiological Signals (2020) Li, Yitong
Deep learning methods have shown unparalleled performance when trained on vast amounts of diverse labeled training data, often collected at great cost. In many contexts, we have many labeled examples but from only a few individuals, a setting that can be thought of as “little big data,” where we would like to take advantage of the large number of samples while still being cognizant of the fact that the number of observed groups is small. This problem is often known as domain adaptation or transfer learning.
In this dissertation, I will cover four major topics. Electroencephalography (EEG) and Local Field Potential (LFP) signals are “big” in terms of the size of recorded data but rarely have sufficient labels required to train complex models. Furthermore, they are collected from a limited number of individuals. In the first topic, I will introduce an interpretable neural model for electrophysiological signals and explain why transfer learning helps in real situations.
In the following two topics, I will expand the discussion of the transfer learning problem to two real and challenging setups. Since data outliers will always exist in practice, many of the sources may be irrelevant to the target task, so ignoring the structure of the dataset is detrimental. Learning domain relationships is often insightful in its own right, and it allows domains to share strength without interference from irrelevant data. On top of the problem of outliers, label shift, where the percentage of data in each class differs between domains, is also an essential consideration for transfer learning.
Transfer learning needs target samples during the training stage, but this requirement may not be satisfied in practice. The last topic discusses generalizing a trained model to unseen test samples, where each training domain has a unique classifier and each test data point is predicted by an admixture over the different domain classifiers.
Item Open Access Modeling and Methodological Advances in Causal Inference (2021) Zeng, Shuxi
This thesis presents several novel modeling or methodological advancements to causal inference. First, we investigate the use of propensity score weighting in randomized trials for covariate adjustment. We introduce the class of balancing weights and study its theoretical properties. We demonstrate that it is asymptotically equivalent to the analysis of covariance (ANCOVA) and derive the closed-form variance estimator. We further recommend the overlap weighting estimator based on its semiparametric efficiency and good finite-sample performance. Next, we focus on comparative effectiveness studies with survival outcomes. As opposed to the approach of coupling with a Cox proportional hazards model, we follow a "once for all" approach and construct pseudo-observations of the censored outcomes. We study the theoretical properties of the propensity score weighting estimator based on pseudo-observations and provide closed-form variance estimators. The third contribution lies in the domain of causal mediation analysis, which studies how much of the treatment effect is mediated or explained through a given intermediate variable. The existing approaches are not directly applicable to scenarios where both the mediator and outcome are measured on sparse and irregular time grids. We propose a causal mediation framework by treating the sparse and irregular data as realizations of smooth processes and provide the assumptions for nonparametric identification. We also provide a functional principal component analysis (FPCA) approach for estimation and carry out inference under a Bayesian paradigm. Furthermore, we study how to achieve double robustness with machine learning approaches. We develop a new algorithm that learns the double-robust representations in observational studies.
The proposed method can learn the low-dimensional representations as well as the balancing weights simultaneously. Lastly, we study how to build a robust prediction model by exploiting the causal relationships. From a causal perspective, we argue robust models should capture the stable causal relationships as opposed to the spurious correlations. We propose a causal transfer random forest method learning the stable causal relationships efficiently from a large scale of observational data and a small amount of randomized data. We provide theoretical justifications and validate the algorithm empirically with synthetic experiments and real world prediction tasks.
In summary, this thesis makes contributions to the following three major areas in causal inference: (i) propensity score weighting methods for randomized experiments and observational studies, which consist of (a) randomized controlled trials (Chapter 2) and (b) survival outcomes (Chapter 3); (ii) causal mediation analysis with sparse and irregular longitudinal data (Chapter 4); (iii) machine learning methods for causal inference, which consist of (a) double robustness (Chapter 5) and (b) causal transfer random forest (Chapter 6).
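The overlap weighting estimator recommended in this abstract can be illustrated with a short sketch. It uses the standard overlap weights (treated units weighted by $1 - e(x)$, control units by $e(x)$, where $e(x)$ is the propensity score) and a weighted difference in means. The helper names `fit_propensity` and `overlap_weighted_effect`, the Newton-method logistic fit, and the simulated trial are assumptions for illustration and are not taken from the thesis.

```python
import numpy as np

def fit_propensity(X, z, n_iter=25):
    """Logistic-regression propensity scores via Newton's method.

    Hypothetical helper; any logistic regression routine would do.
    """
    X1 = np.column_stack([np.ones(len(z)), X])  # add intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))
        grad = X1.T @ (z - p)
        hess = X1.T @ (X1 * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-X1 @ beta))

def overlap_weighted_effect(y, z, e):
    """Weighted difference in means with overlap weights.

    Treated units get weight 1 - e, control units get weight e.
    """
    y, z, e = (np.asarray(a, dtype=float) for a in (y, z, e))
    w1 = z * (1.0 - e)
    w0 = (1.0 - z) * e
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

# Assumed simulated randomized trial: true treatment effect 1.0,
# two prognostic covariates, treatment assigned by a fair coin.
rng = np.random.default_rng(1)
n = 2000
X = rng.standard_normal((n, 2))
z = rng.integers(0, 2, size=n)
y = 1.0 * z + X @ np.array([2.0, -1.0]) + rng.standard_normal(n)

e_hat = fit_propensity(X, z)
print("overlap-weighted estimate:", overlap_weighted_effect(y, z, e_hat))
print("unadjusted difference in means:", y[z == 1].mean() - y[z == 0].mean())
```

With a constant propensity score the estimator reduces to the plain difference in means; using a fitted propensity model is what lets the weights correct chance covariate imbalance between arms.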
Item Open Access Speaker Representation Learning under Self-supervised and Knowledge Transfer Setting (2023) Cai, Danwei
Speaker representation learning transforms speech signals into informative vectors, underpinning many audio applications. However, deep neural networks (DNNs), pivotal in this domain, falter with limited labeled data.
To overcome this, the thesis presents two primary strategies: self-supervised learning and knowledge transfer from automatic speech recognition (ASR). We introduce a two-stage self-supervised framework utilizing unlabeled data. The first stage focuses on representation learning, while the second integrates clustering and discriminative training. This framework is further streamlined by introducing the self-supervised reflective learning approach, central to which is self-supervised knowledge distillation, optimized to mitigate label noise effects. This approach significantly improves self-supervised speaker representation quality.
Leveraging the relationship between ASR and speaker verification, transfer learning methods are explored to use limited training data efficiently. Techniques include initializing with ASR-pretrained encoders, ASR-based knowledge distillation, and a speaker adaptor converting ASR features to speaker-specific ones.
Additionally, the thesis investigates voice conversion spoofing countermeasures, aiming to detect attacker identities behind conversions.
In essence, this research offers advancements in speaker representation learning, tackling data constraints, and enhancing security against voice spoofing, ultimately fortifying audio applications.
Item Open Access Task Affinity and Its Applications in Machine Learning (2023) Le, Cat Phuoc
Transfer learning has been an essential aspect of machine learning, in which the knowledge from previously trained tasks is utilized to learn the incoming tasks. Recent works on transfer learning primarily focus on learning algorithms for the scenario of a learned source task and a target task. Here, the goal is to identify the source model's functional layers and corresponding parameters to fine-tune with the target task's dataset. It is established in the literature that similar tasks can share equivalent learning models (e.g., architecture, number of layers, parameters). As a result, selecting relevant data samples and the trained models for the target task is also essential to the success of transfer learning. However, this area of research has yet to be thoroughly studied. This work focuses on task affinity, which is a similarity measure between tasks, and its applications in machine learning through a transfer learning process. Based on the Fisher Information matrices, the proposed task affinity is non-symmetric by definition, because it is easier to transfer knowledge from a complex and comprehensive task to a simple task than vice versa. The task affinity helps determine the relevant source tasks, their corresponding datasets, and trained models for the target tasks. Additionally, a meta-learning framework, whose goal is learning to learn, is introduced based on the proposed task affinity. This framework is designed for a scenario with multiple learned source tasks and a target task. Here, the artificial intelligence agent is assumed to have sufficiently extensive memory for storing learned tasks (e.g., trained models and datasets). The meta-learning framework allows this agent to identify relevant knowledge from the source tasks and quickly learn the target task without human domain knowledge.
This framework is motivated by the learning process of humans, which starts by learning simple and basic tasks before tackling more advanced subjects. For instance, when solving complicated tasks, humans often relate them to more straightforward tasks. This framework also helps reduce the amount of required data samples from the target task and further boosts the model's performance. Overall, this dissertation presents the definitions of task affinity and the meta-learning frameworks for various applications in machine learning, such as neural architecture search, few-shot learning, image generation, and causal inference. The theoretical and empirical studies indicate the consistency of the task affinity and the efficacy of the proposed framework compared with other state-of-the-art approaches to machine learning applications.
Item Open Access Transfer Learning in Value-based Methods with Successor Features (2023) Nemecek, Mark William
This dissertation investigates the concept of transfer learning in a reinforcement learning (RL) context. Transfer learning is based on the idea that it is possible for an agent to use what it has learned in one task to improve the learning process in another task as compared to learning from scratch. This improvement can take multiple forms, such as reducing the number of samples required to reach a given level of performance or even increasing the best performance achieved. In particular, we examine properties and applications of successor features, which are a useful representation that allows efficient calculation of action-value functions for a given policy in different contexts.
Our first contribution is a method for incremental construction of a cache of policies for a family of tasks. When a family of tasks share transition dynamics but differ in reward function, successor features allow us to efficiently compute the action-value functions for known policies in new tasks. As the optimal policy for a new task might be the same as or similar to that for a previous task, it is not always necessary for an agent to learn a new policy for each new task it encounters, especially if it is allowed some amount of suboptimality. We present new bounds for the performance of optimal policies in a new task, as well as an approach to use these bounds to decide, when presented with a new task, whether to use cached policies or learn a new policy.
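The efficiency claim above rests on a standard identity from the successor-features literature: when tasks share dynamics and rewards are linear in features with task weights $w$, the action-value function of a cached policy $\pi_i$ in a new task is $Q^{\pi_i}_w(s,a) = \psi^{\pi_i}(s,a)^\top w$, a single dot product. The sketch below shows that evaluation for a tabular cache, plus the usual generalized policy improvement step that acts greedily over all cached policies. The array layout and function names are assumptions, and the dissertation's own bounds and its rule for deciding when to learn a new policy are not reproduced here.

```python
import numpy as np

def q_values(psi, w):
    """Action values of every cached policy on the task with reward weights w.

    psi: array of shape (n_policies, n_states, n_actions, d) holding the
         successor features of each cached policy (assumed tabular layout).
    w:   array of shape (d,) with the new task's reward weights.
    Returns an array of shape (n_policies, n_states, n_actions).
    """
    return np.asarray(psi, dtype=float) @ np.asarray(w, dtype=float)

def gpi_actions(psi, w):
    """Generalized policy improvement: per state, take the action whose
    best value over all cached policies is highest."""
    q = q_values(psi, w)
    return q.max(axis=0).argmax(axis=1)

# Toy cache: 2 policies, 1 state, 2 actions, 2 reward features.
psi = np.zeros((2, 1, 2, 2))
psi[0, 0, 0] = [1.0, 0.0]  # policy 0, action 0 accumulates feature 0
psi[1, 0, 1] = [0.0, 1.0]  # policy 1, action 1 accumulates feature 1

print(gpi_actions(psi, np.array([1.0, 0.0])))  # task rewarding feature 0
print(gpi_actions(psi, np.array([0.0, 2.0])))  # task rewarding feature 1
```

No environment interaction is needed to evaluate the cache on a new task, which is what makes it cheap to check whether an existing policy is already good enough before training a new one.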
In our second contribution, we examine the problem of hierarchical reinforcement learning, which involves breaking a task down into smaller subtasks which are easier to solve, through the lens of transfer learning. Within a single task, a subtask may encapsulate a behavior which could be used multiple times for completing the task, but occur in different contexts, such as opening doors while navigating a building. When the reward function changes between tasks, a given subtask may be unaffected, i.e., the optimal behavior within that subtask may remain the same. If so, the behavior may be immediately reused to accelerate training of behaviors for other subtasks. In both of these cases, reusing the learned behavior can be viewed as a transfer learning problem. We introduce a method based on the MAXQ value function decomposition which uses two applications of successor features to facilitate both transfer within a task and transfer between tasks with different reward functions.
The final contribution of this dissertation introduces a method for transfer using a value-based approach in domains with continuous actions. When an environment's action space is continuous, finding the action which maximizes an action-value function approximator efficiently often requires defining a constrained approximator which results in suboptimal behavior. Recently the RBF-DQN approach was proposed to use deep radial-basis value functions to allow efficient maximization of an action-value approximator over the actions while not losing the universal approximator property of neural networks. We present a method which extends this approach to use successor features in order to allow for effective transfer learning between tasks which differ in reward function.