Safe reinforcement learning: An overview, a hybrid systems perspective, and a case study

Potok, Matthew

Content Files

POTOK-THESIS-2018.pdf

Permalink

https://hdl.handle.net/2142/102518

Description

Title: Safe reinforcement learning: An overview, a hybrid systems perspective, and a case study
Author(s): Potok, Matthew
Issue Date: 2018-12-12
Director of Research (if dissertation) or Advisor (if thesis): Mitra, Sayan
Department of Study: Electrical and Computer Engineering
Discipline: Electrical and Computer Engineering
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree Name: M.S.
Degree Level: Thesis
Date of Ingest: 2019-02-06T19:36:44Z
Keyword(s): Reinforcement Learning
Abstract: Reinforcement learning (RL) is a general method for agents to learn optimal control policies through exploration and experience. Due to its generality, RL can generate novel policies that may not be easily expressed with rules-based strategies or traditional control techniques. Since its inception, RL has solved increasingly challenging control problems, from GridWorld to Go. Despite these impressive results, the successes of RL have been predominantly limited to systems with discrete environments and agents, particularly video and board games. A key barrier to using RL in safety-critical cyber-physical system applications is not only transferring these results to continuous domains but also ensuring that a notion of 'safety' is upheld during the learning process. This thesis highlights some of the recent contributions in safe learning and presents a framework, FoRShield, for learning safe policies of a control system with nonlinear dynamics. The framework develops a generic hybrid systems model for online RL. The model is used to formalize a shield that can filter unsafe action choices and provide feedback to the underlying RL system. The thesis presents a concrete approach for computing the shield using existing reachability analysis tools. The feasibility of this approach is illustrated with a case study in which a quadcopter uses RL to discover a safe and optimal plan for a dynamic fire-fighting task. The approach is realized as an open-source framework, FoRShield. The framework is implemented in Python in a modular fashion to allow for testing of a variety of algorithms. Our particular implementation uses the Actor-Critic algorithm to learn policies. The experiments show that interesting fire-fighting strategies can be safely learned for a discrete environment with 2^32 states and a 9-dimensional plant model using a standard laptop computer.
Graduation Semester: 2018-12
Type of Resource: text
Permalink: http://hdl.handle.net/2142/102518
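
The shield idea in the abstract (filter unsafe action choices and return feedback to the learner) can be illustrated with a minimal Python sketch. It is not FoRShield's code or API. The function names, the 1-D plant x' = x + a*dt + w with a bounded disturbance w, the interval over-approximation standing in for a reachability tool, and the penalty value are all hypothetical choices made for illustration only.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def reach_interval(x: Interval, action: float,
                   dt: float = 0.1, disturbance: float = 0.05) -> Interval:
    """Over-approximate the next state set of the assumed toy plant
    x' = x + action*dt + w, with |w| <= disturbance."""
    lo, hi = x
    return (lo + action * dt - disturbance, hi + action * dt + disturbance)


def is_safe(reach: Interval, safe: Interval) -> bool:
    """The reach set is safe if it lies entirely inside the safe interval."""
    return safe[0] <= reach[0] and reach[1] <= safe[1]


def shield(x: Interval, proposed: float, actions: List[float],
           safe: Interval, penalty: float = -1.0) -> Tuple[float, float]:
    """Return (action_to_execute, feedback_for_learner).

    A safe proposal passes through with zero feedback. An unsafe proposal
    is replaced by the closest safe action, and a (hypothetical) penalty
    is reported so the learner can discourage that choice.
    """
    if is_safe(reach_interval(x, proposed), safe):
        return proposed, 0.0
    safe_actions = [a for a in actions if is_safe(reach_interval(x, a), safe)]
    if not safe_actions:
        raise RuntimeError("no safe action available from this state")
    fallback = min(safe_actions, key=lambda a: abs(a - proposed))
    return fallback, penalty


if __name__ == "__main__":
    state = (0.9, 0.9)        # current position, known exactly
    safe_set = (-1.0, 1.0)    # positions that count as safe
    print(shield(state, 1.0, [-1.0, 0.0, 1.0], safe_set))
```

In this sketch the learner proposes moving toward the boundary, the one-step reach set leaves the safe interval, and the shield substitutes a safe action. A real shield for nonlinear dynamics would replace `reach_interval` with a call to a reachability analysis tool.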

Owning Collections

Graduate Dissertations and Theses at Illinois (primary)

Dissertations and Theses - Electrical and Computer Engineering
