Reinforcement learning with offline data: foundations and algorithms
Xie, Tengyang
Permalink
https://hdl.handle.net/2142/121258
Description
- Title
- Reinforcement learning with offline data: foundations and algorithms
- Author(s)
- Xie, Tengyang
- Issue Date
- 2023-07-13
- Director of Research
- Jiang, Nan
- Doctoral Committee Chair(s)
- Jiang, Nan
- Committee Member(s)
- Forsyth, David
- Srikant, Rayadurgam
- Brunskill, Emma
- Chen, Xi
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- artificial intelligence
- machine learning
- reinforcement learning
- statistical learning theory
- Abstract
- Reinforcement Learning (RL) has displayed its formidable capabilities across a plethora of complex challenges, including mastering the game of Go (Silver et al., 2016, 2017) and strategically navigating video games such as StarCraft II (Vinyals et al., 2019). Despite these triumphs in simulated environments, the real-world impact of reinforcement learning has yet to fully materialize. This dissertation sets out to take steps towards a lofty, long-term objective: transferring the impressive achievements of reinforcement learning from simulated environments to real-world scenarios. Unlike supervised learning, a paradigm naturally attuned to data-driven applications and already prevalent in a multitude of real-world settings, reinforcement learning is innately interactive. This trait poses one of the most significant impediments to its broader real-world adoption. The interactive nature of reinforcement learning necessitates an iterative learning procedure that continuously collects experiential data from the environment and learns from the collected data, primarily by deploying the most recently learned policy for both evaluation and improvement. This degree of online interaction, which often demands high adaptivity or frequent policy switching, is often infeasible in real-world contexts due to expensive or even risky data collection, such as in robotics, autonomous driving, or healthcare. Addressing this issue of low adaptivity has been a longstanding focus of reinforcement learning research, with numerous studies examining off-policy learning, which permits a divergence between the data-collection policy and the target policy. Yet most off-policy reinforcement learning techniques struggle with the more demanding problem of offline reinforcement learning, which studies reinforcement learning from purely offline data (a schematic formulation is sketched after this abstract). This dissertation therefore targets a complete solution for reinforcement learning with offline data. The proposed approaches are motivated by the aforementioned challenges and aim to enable the data-driven learning paradigm of reinforcement learning from both a theoretical and an algorithmic perspective. To attain this goal, the dissertation investigates three main topics:
- Part I: Foundations of Offline Reinforcement Learning. This part answers several fundamental questions about offline RL, including the minimal approximation assumptions needed to conduct offline RL and the fundamental algorithmic concepts underlying it.
- Part II: Algorithms of Offline Reinforcement Learning. This part presents algorithms for offline RL that are both provably efficient and capable of strong performance. These algorithms also attain properties desirable in practical applications, especially achieving the best of both worlds between offline RL and imitation learning.
- Part III: Bridging Offline and Online Reinforcement Learning. The last part goes beyond the purely offline setting and works towards bridging sample-efficient online and offline reinforcement learning, investigating how algorithmic concepts and learnability conditions connect across the two settings.
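To make the setting concrete, here is a minimal schematic of the offline RL problem the abstract contrasts with online RL. The notation (behavior policy \mu, dataset \mathcal{D}, transition kernel P, discount factor \gamma) is standard and assumed for illustration, not quoted from the dissertation itself.

% Offline RL, schematically (LaTeX fragment; assumes amsmath and amssymb).
% The learner receives a fixed dataset collected by a behavior policy \mu:
\[
  \mathcal{D} = \{(s_i, a_i, r_i, s_i')\}_{i=1}^{n},
  \qquad (s_i, a_i) \sim \mu, \quad s_i' \sim P(\cdot \mid s_i, a_i),
\]
% and must output a policy \hat{\pi} that approximately maximizes the expected
% discounted return, using only \mathcal{D} and no further environment interaction:
\[
  J(\pi) = \mathbb{E}\Bigl[\textstyle\sum_{t=0}^{\infty} \gamma^{t} r(s_t, a_t)
  \,\Bigm|\, s_0 \sim d_0,\; a_t \sim \pi(\cdot \mid s_t)\Bigr].
\]
% Off-policy learning refers to the general case where the target policy \pi
% differs from the data-collection policy \mu; offline RL is the extreme case
% in which no new data can be gathered at all.

Online RL would instead interleave updates to \pi with fresh data collected by the current \pi itself, which is exactly the degree of adaptivity the abstract notes is often infeasible in real-world applications.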
- Graduation Semester
- 2023-08
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2023 Tengyang Xie
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)