Hierarchical Bayesian Learning Approaches for Different Labeling Cases
Repository Usage Stats
The goal of a machine learning problem is to learn useful patterns from observations so that appropriate inference can be made from new observations as they become available. Based on whether labels are available for training data, a vast majority of the machine learning approaches can be broadly categorized into supervised or unsupervised learning approaches. In the context of supervised learning, when observations are available as labeled feature vectors, the learning process is a well-understood problem. However, for many applications, the standard supervised learning becomes complicated because the labels for observations are unavailable as labeled feature vectors. For example, in a ground penetrating radar (GPR) based landmine detection problem, the alarm locations are only known in 2D coordinates on the earth's surface but unknown for individual target depths. Typically, in order to apply computer vision techniques to the GPR data, it is convenient to represent the GPR data as a 2D image. Since a large portion of the image does not contain useful information pertaining to the target, the image is typically further subdivided into subimages along depth. These subimages at a particular alarm location can be considered as a set of observations, where the label is only available for the entire set but unavailable for individual observations along depth. In the absence of individual observation labels, for the purposes of training standard supervised learning approaches, observations both above and below the target are labeled as targets despite substantial differences in their characteristics. As a result, the label uncertainty with depth would complicate the parameter inference in the standard supervised learning approaches, potentially degrading their performance. In this work, we develop learning algorithms for three such specific scenarios where: (1) labels are only available for sets of independent and identically distributed (i.i.d.) observations, (2) labels are only available for sets of sequential observations, and (3) continuous correlated multiple labels are available for spatio-temporal observations. For each of these scenarios, we propose a modification in a traditional learning approach to improve its predictive accuracy. The first two algorithms are based on a set-based framework called as multiple instance learning (MIL) whereas the third algorithm is based on a structured output-associative regression (SOAR) framework. The MIL approaches are motivated by the landmine detection problem using GPR data, where the training data is typically available as labeled sets of observations or sets of sequences. The SOAR learning approach is instead motivated by the multi-dimensional human emotion label prediction problem using audio-visual data, where the training data is available in the form of multiple continuous correlated labels representing complex human emotions. In both of these applications, the unavailability of the training data as labeled featured vectors motivate developing new learning approaches that are more appropriate to model the data.
A large majority of the existing MIL approaches require computationally expensive parameter optimization, do not generalize well with time-series data, and are incapable of online learning. To overcome these limitations, for sets of observations, this work develops a nonparametric Bayesian approach to learning in MIL scenarios based on Dirichlet process mixture models. The nonparametric nature of the model and the use of non-informative priors remove the need to perform cross-validation based optimization while variational Bayesian inference allows for rapid parameter learning. The resulting approach is highly generalizable and also capable of online learning. For sets of sequences, this work integrates Hidden Markov models (HMMs) into an MIL framework and develops a new approach called the multiple instance hidden Markov model. The model parameters are inferred using variational Bayes, making the model tractable and computationally efficient. The resulting approach is highly generalizable and also capable of online learning. Similarly, most of the existing approaches developed for modeling multiple continuous correlated emotion labels do not model the spatio-temporal correlation among the emotion labels. Few approaches that do model the correlation fail to predict the multiple emotion labels simultaneously, resulting in latency during testing, and potentially compromising the effectiveness of implementing the approach in real-time scenario. This work integrates the output-associative relevance vector machine (OARVM) approach with the multivariate relevance vector machine (MVRVM) approach to simultaneously predict multiple emotion labels. The resulting approach performs competitively with the existing approaches while reducing the prediction time during testing, and the sparse Bayesian inference allows for rapid parameter learning. Experimental results on several synthetic datasets, benchmark datasets, GPR-based landmine detection datasets, and human emotion recognition datasets show that our proposed approaches perform comparably or better than the existing approaches.
DepartmentElectrical and Computer Engineering
Multiple instance learning
Multivariate output-associative regression
More InfoShow full item record
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations