
Deep Reinforcement Learning with Temporal Logic Specifications

dc.contributor.advisor Zavlanos, Michael M
dc.contributor.author Gao, Qitong
dc.date.accessioned 2018-05-31T21:18:58Z
dc.date.available 2018-11-15T09:17:12Z
dc.date.issued 2018
dc.identifier.uri https://hdl.handle.net/10161/17056
dc.description.abstract In this thesis, we propose a model-free reinforcement learning method to synthesize control policies that satisfy Linear Temporal Logic (LTL) specifications for mobile robots modeled as Markov Decision Processes (MDPs) with unknown transition probabilities. The key idea is to employ Deep Q-Learning techniques that rely on Neural Networks (NNs) to approximate the state-action values of the MDP, and to design a reward function that depends on the accepting condition of the Deterministic Rabin Automaton (DRA) that captures the LTL specification. Unlike related work, our method does not require learning the transition probabilities of the MDP, constructing a product MDP, or computing Accepting Maximal End Components (AMECs). This significantly reduces the computational cost and also renders our method applicable to planning problems where AMECs do not exist. In that case, the resulting control policies minimize the frequency with which the system enters bad states in the DRA, i.e., states that violate the task specifications. To the best of our knowledge, this is the first model-free deep reinforcement learning algorithm that can synthesize policies maximizing the probability of satisfying an LTL specification even when AMECs do not exist. We validate our method through numerical experiments.
dc.subject Mechanical engineering
dc.title Deep Reinforcement Learning with Temporal Logic Specifications
dc.type Master's thesis
dc.department Mechanical Engineering and Materials Science
duke.embargo.months 5
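
The abstract above outlines the core mechanism: a Q-network over the MDP state augmented with the DRA state, trained with a reward derived from the Rabin accepting condition. Below is a minimal, hypothetical sketch of that idea in PyTorch. The environment, network sizes, reward values, and all identifiers (QNetwork, dra_reward, the accepting/rejecting sets) are illustrative assumptions, not the thesis implementation.

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    # Approximates Q((s, q), a): the MDP state s is augmented with a one-hot
    # encoding of the DRA state q, so the value function can track LTL progress.
    def __init__(self, state_dim, num_dra_states, num_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + num_dra_states, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, s, q_onehot):
        return self.net(torch.cat([s, q_onehot], dim=-1))

def dra_reward(q_next, accepting, rejecting):
    # Reward shaped by the Rabin condition (assumed values): reward visits to
    # "good" DRA states, penalize "bad" (violating) ones, small step cost else.
    if q_next in accepting:
        return 1.0
    if q_next in rejecting:
        return -1.0
    return -0.01

# One DQN-style TD update on dummy data: 2-D MDP state, 3-state DRA, 4 actions.
qnet = QNetwork(state_dim=2, num_dra_states=3, num_actions=4)
target = QNetwork(state_dim=2, num_dra_states=3, num_actions=4)
target.load_state_dict(qnet.state_dict())
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
gamma = 0.99

s = torch.randn(1, 2); q = torch.tensor([[1.0, 0.0, 0.0]])    # current (s, q)
s2 = torch.randn(1, 2); q2 = torch.tensor([[0.0, 1.0, 0.0]])  # next (s', q')
a = torch.tensor([[2]])                                       # action taken
r = dra_reward(1, accepting={1}, rejecting={2})               # DRA-based reward

with torch.no_grad():
    y = r + gamma * target(s2, q2).max(dim=1).values          # TD target
loss = nn.functional.mse_loss(qnet(s, q).gather(1, a).squeeze(1), y)
opt.zero_grad(); loss.backward(); opt.step()

Because the Q-function conditions on the DRA state, no explicit product MDP is ever built; the automaton is only simulated alongside the environment to look up q' and the reward, which is consistent with the abstract's claim of avoiding product-MDP construction.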

