Browsing by Subject "Online learning"
- Results Per Page
- Sort Options
Item Open Access Dynamic Mechanism Design in Complex Environments(2020) Deng, YuanInspired by various applications including ad auctions, matching markets, and voting, mechanism design deals with the problem of designing algorithms that take inputs from strategic agents and return an outcome optimizing a given objective, while taking the strategic behavior from the agents into account.
The focus of this thesis is to design mechanisms in dynamic environments that take into account rich constraints (e.g., budget constraints), features (e.g., robustness and credibility), and different types of agents (e.g., utility-maximizing agents and learning agents). Two main reasons why dynamic mechanism design is hard compared to mechanism design in a static environment are the need to make decisions in an online manner while the future might be unpredictable or even be chosen by an adversary arbitrarily, and the need to cope with strategic agents, who aim to maximize their cumulative utilities by looking into the future.
We propose a framework to design dynamic mechanisms with simple structures for utility-maximizing agents without losing any optimality, which facilitates both the design for the designer and the participation for the agents. Our framework enables the design of mechanisms achieving non-trivial performance guarantees relative to the optimal mechanism that has access to all future information in advance, even though our mechanisms are not equipped with any knowledge about the future. We further develop a class of dynamic mechanisms that are robust against estimation errors in agents' valuation distributions, a class of dynamic mechanisms that are credible so that the designer is incentivized to follow the rules, and a class of dynamic mechanisms for learning agents. In addition to dynamic mechanism design frameworks, we develop statistical tools to test whether a dynamic mechanism has correctly aligned the agents' incentives, and to measure the extent of the misalignment if it exists.
Item Open Access Learning Languages in Cyberspace: A case study of world languages programs in state virtual public schools(2018-12-05) Zhou, YueLearning foreign languages has not been a priority of U.S. K-12 education. National enrollment rate in World Languages courses remains low due to lack of funding, course offerings, and qualified teachers. The rapid development of virtual education in recent years provides potential solutions to challenges faced by World Languages programs but stakeholders also question the effectiveness of the virtual classroom. This mixed-methods case study provides an overview of World Languages programs in state virtual public schools nationwide and examines specifically North Carolina Virtual Public School (NCVPS) and its World Languages program. Analyzing the course offerings, enrollment pattern and enrollment changes of the NCVPS World Languages program over the last decade, the study finds that the program has made less commonly taught foreign languages more accessible to students and has benefited students from rural school districts in its initial years. The survey responses of NCVPS World Languages teachers along with four expert interviews reveal that the online program enjoys more resources such as private donation and partnership with universities compared to face-to-face classrooms. Many teachers expressed that the online and part-time nature of the program give them great flexibility. Analysis of teacher survey response also finds that holding students accountable is among the greatest challenges of virtual learning. Teacher opinions are mixed regarding whether learning a language online is better than learning in face-to-face classrooms.Item Open Access Topics in Online Markov Decision Processes(2015) Guan, PengThis dissertation describes sequential decision making problems in non-stationary environments. Online learning algorithms deal with non-stationary environments, but generally there is no notion of a dynamic state to model future impacts of past actions. State-based models are common in stochastic control settings, but well-known frameworks such as Markov decision processes (MDPs) assume a known stationary environment. In recent years, there has been a growing interest in fusing the above two important learning frameworks and considering an MDP setting in which the cost function is allowed to change arbitrarily over time. A number of online MDP algorithms have been designed to work under various assumptions about the dynamics of state transitions so far and provide performance guarantees, i.e. bounds on the regret defined as the performance gap between the total cost incurred by the learner and the total cost of the best available stationary policy that could have been chosen in hindsight.
However, most of the work in this area has been algorithmic: given a problem, one
would develop an algorithm almost from scratch and prove the performance guarantees on a case-by-case basis. Moreover, the presence of the state and the assumption of an arbitrarily varying environment complicate both the theoretical analysis and the development of computationally efficient methods. Another potential issue is that, by removing distributional assumptions about the mechanism generating the cost sequences, the existing methods have to consider the worst-case scenario, which may render their solutions too conservative in situations where the environment exhibits some degree of predictability.
This dissertation contributes several novel techniques to address the above challenges of the online MDP framework and opens up new research directions for online MDPs.
Our proposed general framework for deriving algorithms in the online MDP setting leads to a unifying view of existing methods and provides a general procedure for constructing new ones. Several new algorithms are developed and analyzed using this framework. We develop convex-analytical algorithms that take advantage of possible regularity of observed sequences, yet maintain the worst case performance guarantees. To further study the convex-analytic methods we applied above, we take a step back to consider the traditional MDP problem and extend the LP approach to MDPs by adding a relative entropy regularization term. A computationally efficient algorithm for this class of MDPs is constructed under mild assumptions on the state transition models. Two-player zero-sum stochastic games are also investigated in this dissertation as an important extension of the online MDP setting. In short, this dissertation provides in-depth analysis of the online MDP problem and answers several important questions in this field.