Risk-averse multi-armed bandits and game theory

Yekkehkhany, Ali

Risk-averse multi-armed bandits and game theory

Yekkehkhany, Ali

Permalink

https://hdl.handle.net/2142/108439

Description

Title

Risk-averse multi-armed bandits and game theory

Author(s)

Yekkehkhany, Ali

Issue Date

2020-07-01

Director of Research (if dissertation) or Advisor (if thesis)

Nagi, Rakesh

Doctoral Committee Chair(s)

Nagi, Rakesh

Committee Member(s)

Hajek, Bruce
Shomorony, Ilan
Srikant, Rayadurgam

Department of Study

Electrical & Computer Eng

Discipline

Electrical & Computer Engr

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

Online Learning
Multi-Armed Bandits
Exploration-Exploitation
Explore-Then-Commit Bandits
Risk-Aversion
Game Theory
Stochastic Game Theory
Congestion Games
Affinity Scheduling
MapReduce
Data Center

Abstract

The multi-armed bandit (MAB) and game theory literature is mainly focused on the expected cumulative reward and the expected payoffs in a game, respectively. In contrast, the rewards and the payoffs are often random variables whose expected values only capture a vague idea of the overall distribution. The focus of this dissertation is to study the fundamental limits of the existing bandits and game theory problems in a risk-averse framework and propose new ideas that address the shortcomings. The author believes that human beings are mostly risk-averse, so studying multi-armed bandits and game theory from the point of view of risk aversion, rather than expected reward/payoff, better captures reality. In this manner, a specific class of multi-armed bandits, called explore-then-commit bandits, and stochastic games are studied in this dissertation, which are based on the notion of Risk-Averse Best Action Decision with Incomplete Information (R-ABADI, Abadi is the maiden name of the author's mother). The goal of the classical multi-armed bandits is to exploit the arm with the maximum score defined as the expected value of the arm reward. Instead, we propose a new definition of score that is derived from the joint distribution of all arm rewards and captures the reward of an arm relative to those of all other arms. We use a similar idea for games and propose a risk-averse R-ABADI equilibrium in game theory that is possibly different from the Nash equilibrium. The payoff distributions are taken into account to derive the risk-averse equilibrium, while the expected payoffs are used to find the Nash equilibrium. The fundamental properties of games, e.g. pure and mixed risk-averse R-ABADI equilibrium and strict dominance, are studied in the new framework and the results are expanded to finite-time games. Furthermore, the stochastic congestion games are studied from a risk-averse perspective and three classes of equilibria are proposed for such games. It is shown by examples that the risk-averse behavior of travelers in a stochastic congestion game can improve the price of anarchy in Pigou and Braess networks. Furthermore, the Braess paradox does not occur to the extent proposed originally when travelers are risk-averse. We also study an online affinity scheduling problem with no prior knowledge of the task arrival rates and processing rates of different task types on different servers. We propose the Blind GB-PANDAS algorithm that utilizes an exploration-exploitation scheme to load balance incoming tasks on servers in an online fashion. We prove that Blind GB-PANDAS is throughput optimal, i.e. it stabilizes the system as long as the task arrival rates are inside the capacity region. The Blind GB-PANDAS algorithm is compared to FCFS, Max-Weight, and c-mu-rule algorithms in terms of average task completion time through simulations, where the same exploration-exploitation approach as Blind GB-PANDAS is used for Max-Weight and c-$\mu$-rule. The extensive simulations show that the Blind GB-PANDAS algorithm conspicuously outperforms the three other algorithms at high loads.

Graduation Semester

2020-08

Type of Resource

Thesis

Permalink

http://hdl.handle.net/2142/108439

Copyright and License Information

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Electrical and Computer Engineering

Dissertations and Theses in Electrical and Computer Engineering

Risk-averse multi-armed bandits and game theory

Yekkehkhany, Ali

Permalink

Description

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Dissertations and Theses - Electrical and Computer Engineering

Log In