Safe reinforcement learning: An overview, a hybrid systems perspective, and a case study
Potok, Matthew
Permalink
https://hdl.handle.net/2142/102518
Description
- Title
- Safe reinforcement learning: An overview, a hybrid systems perspective, and a case study
- Author(s)
- Potok, Matthew
- Issue Date
- 2018-12-12
- Director of Research (if dissertation) or Advisor (if thesis)
- Mitra, Sayan
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Reinforcement Learning
- Abstract
- Reinforcement learning (RL) is a general method for agents to learn optimal control policies through exploration and experience. Due to its generality, RL can generate novel policies that may not be easily expressed with rules-based strategies or traditional control techniques. Since its inception, RL has solved increasingly challenging control problems, from GridWorld to Go. Despite these impressive results, the successes of RL have been predominantly limited to systems with discrete environments and agents, particularly video and board games. A key barrier to using RL in safety-critical cyber-physical system applications is not only transferring these results to continuous domains but also ensuring that a notion of "safety" is upheld during the learning process. This thesis highlights some of the recent contributions in safe learning and presents a framework, FoRShield, for learning safe policies for a control system with nonlinear dynamics. The framework develops a generic hybrid systems model for online RL. The model is used to formalize a shield that can filter unsafe action choices and provide feedback to the underlying RL system. The thesis presents a concrete approach for computing the shield using existing reachability analysis tools. The feasibility of this approach is illustrated with a case study in which a quadcopter uses RL to discover a safe and optimal plan for a dynamic fire-fighting task. The approach is realized as an open-source framework, FoRShield. The framework is implemented in Python in a modular fashion to allow for testing of a variety of algorithms. Our particular implementation uses the Actor-Critic algorithm to learn policies. The experiments show that interesting fire-fighting strategies can be safely learned for a discrete environment with 2^32 states and a 9-dimensional plant model using a standard laptop computer.
- Graduation Semester
- 2018-12
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/102518
- Copyright and License Information
- Copyright 2018 Matthew Potok
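The shield described in the abstract, which filters unsafe action choices and gives feedback to the learner, can be illustrated with a minimal Python sketch. This is not FoRShield's code or API. Every name here (`step`, `is_safe`, `shield`, `rollout`, `ACTIONS`) is hypothetical, the plant is a toy 1-D system rather than the thesis's 9-dimensional model, and a one-step lookahead check stands in for the reachability analysis the thesis uses.

```python
import random
from typing import List, Sequence, Tuple

ACTIONS: List[float] = [-1.0, 0.0, 1.0]  # hypothetical discrete action set


def step(state: float, action: float) -> float:
    """Toy 1-D plant (an assumption): the position moves by the action."""
    return state + action


def is_safe(state: float) -> bool:
    """Toy safety predicate (an assumption): stay inside [0, 10]."""
    return 0.0 <= state <= 10.0


def shield(state: float, proposed: float, actions: Sequence[float],
           penalty: float = -1.0) -> Tuple[float, float]:
    """Return (action to execute, feedback for the learner).

    A one-step lookahead replaces the reachability analysis here.
    """
    if is_safe(step(state, proposed)):
        return proposed, 0.0  # proposal passes through, no penalty
    safe_actions = [a for a in actions if is_safe(step(state, a))]
    if not safe_actions:
        raise RuntimeError("no safe action available from this state")
    # Substitute the safe action closest to what the learner proposed.
    fallback = min(safe_actions, key=lambda a: abs(a - proposed))
    return fallback, penalty


def rollout(start: float, steps: int, seed: int = 0) -> List[float]:
    """Drive the plant with a random policy filtered by the shield."""
    rng = random.Random(seed)
    state = start
    visited = [state]
    for _ in range(steps):
        proposed = rng.choice(ACTIONS)
        action, _feedback = shield(state, proposed, ACTIONS)
        state = step(state, action)
        visited.append(state)
    return visited
```

In this sketch, an RL agent would add the returned feedback to its reward signal, so that proposals the shield overrides are discouraged over time while every executed action keeps the toy plant inside its safe set.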
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Dissertations and Theses in Electrical and Computer Engineering