Toward Assured Autonomy with Model-Free Reinforcement Learning
Abstract
Autonomous systems (AS), enhanced by the capabilities of reinforcement learning (RL), are expected to perform increasingly sophisticated tasks across various civilian and industrial application domains. This expectation stems from their ability to make decisions based solely on perception, without human intervention. Beyond high efficiency, AS often require robustness and safety guarantees for real-world deployment. In this thesis, we propose model-free RL approaches that obtain controllers for AS operating in unknown, stochastic, and potentially adversarial environments directly from linear temporal logic (LTL) specifications defined on state labels, such as safety and liveness requirements. This ensures that the learned controllers satisfy the desired properties, avoid unintended consequences, and remain robust against adversarial behavior.

We first derive a novel rewarding and discounting mechanism from the LTL specifications for Markov decision processes. We show that a policy learned by a model-free RL algorithm that maximizes the sum of these discounted rewards also maximizes the probability of satisfying the LTL specifications. We generalize this approach to multiple objectives: ensuring safety takes the highest priority, satisfying the other LTL specifications is secondary, and improving the quality of control is tertiary. We then extend our results to zero-sum stochastic games to ensure the robustness of learned controllers against any unpredictable, nondeterministic environment behavior. To address the scalability challenges inherent in learning controllers for stochastic games, we propose heuristics and approximate methods that further accelerate learning. We illustrate how our approach can be used to learn controllers that are resilient against stealthy attackers capable of disrupting the agent's actuation without being detected. For cases where state labels are absent, we further discuss an approach that learns a labeling function translating raw state information into object properties usable in LTL specifications, thereby enabling controllers to be learned from LTL specifications.

Through numerous case studies, we demonstrate that our approaches successfully learn optimal controllers. These controllers maximize the probability of satisfying LTL specifications in the worst case, thereby exhibiting resilience against adversarial behavior. Moreover, our methods scale across a broad spectrum of LTL specifications, consistently surpassing the performance of existing approaches.
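To make the rewarding and discounting mechanism concrete, the sketch below shows one plausible reading of the idea in Python; it is an illustration, not the dissertation's implementation. It runs tabular Q-learning on a product of the environment with an automaton derived from the LTL specification, paying reward 1 - GAMMA_B on accepting transitions (discounted by GAMMA_B) and reward 0 elsewhere (discounted by GAMMA). The environment interface (reset, step, actions), the accepting-transition flag, and all constants are assumptions made for this sketch.

import random
from collections import defaultdict

# Hypothetical sketch: Q-learning on a product MDP whose states track progress
# of an automaton derived from the LTL specification. Accepting transitions
# yield reward 1 - GAMMA_B and are discounted by GAMMA_B; all other transitions
# yield reward 0 and are discounted by GAMMA. All values below are assumed.
GAMMA = 0.9      # base discount for non-accepting transitions
GAMMA_B = 0.99   # discount for accepting transitions, chosen close to 1
ALPHA = 0.1      # learning rate
EPSILON = 0.1    # exploration rate

def q_learn(env, episodes=10_000):
    """env is an assumed interface, not the thesis API: reset() -> state,
    step(a) -> (next_state, accepting, done), actions(state) -> list of
    actions, where 'accepting' flags automaton-accepting transitions."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            acts = env.actions(s)
            # epsilon-greedy action selection
            if random.random() < EPSILON:
                a = random.choice(acts)
            else:
                a = max(acts, key=lambda b: Q[s, b])
            s2, accepting, done = env.step(a)
            # state-dependent reward and discount, per the mechanism above
            r, gamma = (1 - GAMMA_B, GAMMA_B) if accepting else (0.0, GAMMA)
            target = r if done else r + gamma * max(Q[s2, b] for b in env.actions(s2))
            Q[s, a] += ALPHA * (target - Q[s, a])
            s = s2
    return Q

Under this kind of construction, as the discount factors approach 1 the discounted return of a policy approximates its probability of satisfying the specification, which is why maximizing the return also maximizes satisfaction probability.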
Citation
Bozkurt, Alper Kamil (2024). Toward Assured Autonomy with Model-Free Reinforcement Learning. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/30808.