Browsing by Subject "Bayesian learning"
Item Open Access: Constrained Discretion and Central Bank Transparency (Economic Research Initiatives at Duke (ERID), 2015-02-01) Bianchi, F.; Melosi, L.
We develop and estimate a general equilibrium model to quantitatively assess the effects and welfare implications of central bank transparency. Monetary policy can deviate from active inflation stabilization, and agents conduct Bayesian learning about the nature of these deviations. Under constrained discretion, only short deviations occur, agents' uncertainty about the macroeconomy remains contained, and welfare is high. However, if a deviation persists, uncertainty accelerates and welfare declines. Announcing the future policy course raises uncertainty in the short run by revealing that active inflation stabilization will be temporarily abandoned. However, the announcement reduces policy uncertainty and anchors inflationary beliefs at the end of the policy. For the U.S., enhancing transparency is found to increase welfare.

Item Open Access: Designing Subscription Services with Imperfect Information and Dynamic Learning (2021) Kao, Yuan-Mao
This dissertation studies how a subscription service provider offers contracts to customers without full information on their preferences. The first essay studies a mechanism design problem for business interruption (BI) insurance. More specifically, we study how an insurer deals with adverse selection and moral hazard when offering BI insurance to a firm. The firm makes demand forecasts and can make a recovery effort if a disruption occurs; both are unobservable to the insurer. We first find that, because of the joint effect of limited production capacity and self-impelled recovery effort, a firm with a lower demand forecast benefits more from BI insurance than one with a higher demand forecast. Anticipating a higher premium, the low-demand firm has an incentive to pretend to have the higher demand forecast to obtain more profit.
We then characterize the optimal insurance contracts to deal with information asymmetry and show how the firm's operational and informational characteristics affect these contracts. We also analyze the case where the firm can choose its initial capacity and find that, from the firm's perspective, capacity and BI insurance could be either substitutes or complements.

The second essay focuses on the learning-and-earning trade-off in subscription service offerings. We consider a service provider offering a subscription service to customers over a multi-period planning horizon. Customers decide whether to subscribe according to a utility model representing their preferences for the subscription service. The provider has a prior belief about the customer utility model. Adjusting the price and subscription period over time, the provider updates its belief based on the transaction data of new customers and the usage data of existing subscribers. The provider aims to minimize its regret, namely the expected profit loss relative to a clairvoyant with full information on the customer utility model. To analyze regret, we first study the clairvoyant's full-information problem. The resulting dynamic program, however, suffers from the curse of dimensionality. We develop a customer-centric approach to resolve this issue and obtain the optimal policy for the full-information problem. This approach strikes an optimal balance between immediate and future profits from an individual customer. When the provider does not have full information, we find that a simple and commonly used certainty-equivalence policy, which learns only passively, exhibits poor performance. We illustrate that this can be due to incomplete or slow learning, but it can also occur because a suboptimal contract with a long subscription period is offered at the beginning. We propose a two-phase learning policy that focuses first on information accumulation and then on profit maximization.
We show that our policy achieves asymptotically optimal performance with its regret growing logarithmically in the planning horizon. Our results indicate that the provider should be cautious about offering a long subscription period when it is uncertain about customer preferences.
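The explore-then-commit structure of such a two-phase policy can be sketched with a toy model. Everything below is a hypothetical illustration, not the model estimated in the dissertation: customers are assumed to subscribe with probability given by a logistic function of price, phase one experiments over a price grid to accumulate information, and phase two commits to the price that maximizes revenue under the fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth (unknown to the provider):
# a customer subscribes with probability sigmoid(a - b * price).
a_true, b_true = 2.0, 1.0

def subscribe(price, n):
    p = 1.0 / (1.0 + np.exp(-(a_true - b_true * price)))
    return rng.binomial(1, p, size=n)

# Phase 1: information accumulation -- offer a grid of prices.
prices = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
X, y = [], []
for price in prices:
    outcomes = subscribe(price, 200)
    X.extend([[1.0, -price]] * 200)
    y.extend(outcomes)
X, y = np.array(X), np.array(y)

# Estimate (a, b) by logistic regression, fit with Newton's method.
w = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    grad = X.T @ (y - p)                          # log-likelihood gradient
    H = -(X * (p * (1 - p))[:, None]).T @ X       # Hessian
    w -= np.linalg.solve(H, grad)
a_hat, b_hat = w

# Phase 2: profit maximization -- commit to the estimated best price.
grid = np.linspace(0.1, 5.0, 500)
revenue = grid / (1.0 + np.exp(-(a_hat - b_hat * grid)))
best_price = grid[np.argmax(revenue)]
```

Under this toy demand curve the committed price lands near the true revenue-maximizing price; the point of the sketch is only the division of the horizon into a learning phase and an earning phase.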
Item Open Access: Dynamic Models of Human Capital Accumulation (2015) Ransom, Tyler
This dissertation consists of three separate essays that use dynamic models to better understand the human capital accumulation process. First, I analyze the role of migration in human capital accumulation and how migration varies over the business cycle. An interesting trend in the data is that, over the period of the Great Recession, overall migration rates in the US remained close to their long-term trends. However, migration evolved differently by employment status: unemployed workers were more likely to migrate during the recession and employed workers less likely. To isolate the mechanisms behind this divergence, I estimate a dynamic, non-stationary search model of migration using a national longitudinal survey from 2004-2013. I focus on the role of employment frictions in migration decisions, in addition to other explanations in the literature. My results show that a divergence in job offer and job destruction rates created differing migration incentives by employment status. I also find that migration rates were muted because of the national scope of the Great Recession. Model simulations show that spatial unemployment insurance in the form of a moving subsidy can help workers move to more favorable markets.
In the second essay, my coauthors and I explore the role of information frictions in the acquisition of human capital. Specifically, we investigate the determinants of college attrition in a setting where individuals have imperfect information about their schooling ability and labor market productivity. We estimate a dynamic structural model of schooling and work decisions, where high school graduates choose a bundle of education and work combinations. We take into account the heterogeneity in schooling investments by distinguishing between two- and four-year colleges and graduate school, as well as science and non-science majors for four-year colleges. Individuals may also choose whether to work full-time, part-time, or not at all. A key feature of our approach is to account for correlated learning through college grades and wages, thus implying that individuals may leave or re-enter college as a result of the arrival of new information on their ability and/or productivity. We use our results to quantify the importance of informational frictions in explaining the observed school-to-work transitions and to examine sorting patterns.
In the third essay, my coauthors and I investigate the evolution over the last two decades in the wage returns to schooling and early work experience.
Using data from the 1979 and 1997 panels of the National Longitudinal Survey of Youth, we isolate changes in skill prices from changes in composition by estimating a dynamic model of schooling and work decisions. Importantly, this allows us to account for the endogenous nature of the changes in educational attainment and accumulated work experience over this time period. We find an increase over this period in the returns to working in high school, but a decrease in the returns to working while in college. We also find an increase in the incidence of working in college, but that any detrimental impact of in-college work experience is offset by changes in other observable characteristics. Overall, our decomposition of the evolution in skill premia suggests that both price and composition effects play an important role. The role of unobserved ability is also important.
Item Open Access: Hierarchical Bayesian Learning Approaches for Different Labeling Cases (2015) Manandhar, Achut
The goal of a machine learning problem is to learn useful patterns from observations so that appropriate inferences can be made from new observations as they become available. Based on whether labels are available for training data, the vast majority of machine learning approaches can be broadly categorized as supervised or unsupervised. In the context of supervised learning, when observations are available as labeled feature vectors, the learning process is a well-understood problem. For many applications, however, standard supervised learning becomes complicated because labels are unavailable at the level of individual observations. For example, in a ground penetrating radar (GPR) based landmine detection problem, alarm locations are known only as 2D coordinates on the earth's surface; individual target depths are unknown. Typically, in order to apply computer vision techniques to the GPR data, it is convenient to represent the GPR data as a 2D image. Since a large portion of the image does not contain useful information pertaining to the target, the image is typically further subdivided into subimages along depth. The subimages at a particular alarm location can be considered a set of observations, where the label is available only for the entire set, not for individual observations along depth. In the absence of individual observation labels, for the purposes of training standard supervised learning approaches, observations both above and below the target are labeled as targets despite substantial differences in their characteristics. The resulting label uncertainty with depth complicates parameter inference in standard supervised learning approaches, potentially degrading their performance.
In this work, we develop learning algorithms for three such scenarios, in which: (1) labels are available only for sets of independent and identically distributed (i.i.d.) observations, (2) labels are available only for sets of sequential observations, and (3) continuous correlated multiple labels are available for spatio-temporal observations. For each of these scenarios, we propose a modification to a traditional learning approach that improves its predictive accuracy. The first two algorithms are based on a set-based framework called multiple instance learning (MIL), whereas the third is based on a structured output-associative regression (SOAR) framework. The MIL approaches are motivated by the landmine detection problem using GPR data, where the training data are typically available as labeled sets of observations or sets of sequences. The SOAR learning approach is instead motivated by the problem of predicting multi-dimensional human emotion labels from audio-visual data, where the training data are available in the form of multiple continuous correlated labels representing complex human emotions. In both applications, the unavailability of the training data as labeled feature vectors motivates the development of new learning approaches that model the data more appropriately.
A large majority of existing MIL approaches require computationally expensive parameter optimization, do not generalize well to time-series data, and are incapable of online learning. To overcome these limitations, for sets of observations, this work develops a nonparametric Bayesian approach to learning in MIL scenarios based on Dirichlet process mixture models. The nonparametric nature of the model and the use of non-informative priors remove the need for cross-validation-based optimization, while variational Bayesian inference allows for rapid parameter learning. The resulting approach is highly generalizable and capable of online learning. For sets of sequences, this work integrates hidden Markov models (HMMs) into an MIL framework and develops a new approach called the multiple instance hidden Markov model. The model parameters are inferred using variational Bayes, making the model tractable and computationally efficient; the resulting approach is likewise highly generalizable and capable of online learning. Similarly, most existing approaches for modeling multiple continuous correlated emotion labels do not model the spatio-temporal correlation among the labels. The few approaches that do model the correlation fail to predict the multiple emotion labels simultaneously, introducing latency during testing and potentially compromising the effectiveness of the approach in real-time scenarios. This work integrates the output-associative relevance vector machine (OARVM) approach with the multivariate relevance vector machine (MVRVM) approach to predict multiple emotion labels simultaneously. The resulting approach performs competitively with existing approaches while reducing prediction time during testing, and sparse Bayesian inference allows for rapid parameter learning.
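The flavor of a Dirichlet process mixture applied to set-labeled (MIL-style) data can be sketched with scikit-learn's truncated DP mixture. This is not the dissertation's model or inference scheme: the synthetic "target" and "background" distributions, the per-class mixture fit, and the max-instance bag-scoring rule below are all illustrative assumptions. The stick-breaking prior prunes unneeded mixture components, which is what removes the need to cross-validate over the number of components.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(1)

# Hypothetical instances: positive bags mix a few "target" instances into
# background clutter; negative bags are background only.
def background(n):
    return rng.normal(0.0, 1.0, size=(n, 2))

def target(n):
    return rng.normal(3.0, 0.5, size=(n, 2))

neg_instances = background(300)
pos_instances = np.vstack([background(240), target(60)])

# Truncated Dirichlet-process Gaussian mixture per class; the stick-breaking
# weight prior shrinks superfluous components toward zero weight.
dp_pos = BayesianGaussianMixture(
    n_components=10, weight_concentration_prior_type="dirichlet_process",
    random_state=0).fit(pos_instances)
dp_neg = BayesianGaussianMixture(
    n_components=10, weight_concentration_prior_type="dirichlet_process",
    random_state=0).fit(neg_instances)

def bag_score(bag):
    # MIL-style bag decision: call a bag positive if its best-scoring
    # instance looks more like the positive mixture than the negative one.
    return np.max(dp_pos.score_samples(bag) - dp_neg.score_samples(bag))

positive_bag = np.vstack([background(8), target(2)])
negative_bag = background(10)
```

A positive bag's few target instances dominate the max, so bags separate even though no instance-level labels were used at decision time; the actual thesis models replace this heuristic scoring with principled variational inference.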
Experimental results on several synthetic datasets, benchmark datasets, GPR-based landmine detection datasets, and human emotion recognition datasets show that our proposed approaches perform comparably to or better than existing approaches.
Item Open Access: Towards Better Representations with Deep/Bayesian Learning (2018) Li, Chunyuan
Deep learning and Bayesian learning are two popular research topics in machine learning that provide flexible representations in complementary ways, so it is desirable to take the best from both fields. This thesis focuses on the intersection of the two topics, enriching each with the other, which inspires two research directions: Bayesian deep learning and deep Bayesian learning.
In Bayesian deep learning, scalable Bayesian methods are proposed to learn the weight uncertainty of deep neural networks (DNNs). On this topic, I propose preconditioned stochastic gradient MCMC methods, show their connection to dropout, and present their applications to modern network architectures in computer vision and natural language processing.
In deep Bayesian learning, DNNs are employed as powerful representations of conditionals in traditional Bayesian models. I focus on understanding recent adversarial learning methods for joint distribution matching, through which several recent bivariate adversarial models are unified. This analysis reveals non-identifiability issues in bidirectional adversarial learning, and I propose ALICE, a conditional entropy framework that remedies these issues. By resolving the non-identifiability, the derived algorithms show significant improvement in image generation and translation tasks.