We propose a privacy-preserving algorithm for policy evaluation in multiagent reinforcement learning. We consider the collaborative setting in which each agent wants to learn the aggregate discounted cost of a control policy while protecting the privacy of its local cost function. We employ the notion of privacy via non-identifiability and propose the use of network-correlated perturbations that cancel over the network, hiding the cost incurred locally at each time step from honest-but-curious adversaries while preserving the important part of the environment structure, namely the aggregate cost. We show that our algorithm preserves the original finite-time analysis result and protects the privacy of local cost functions under certain graph conditions.
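The core cancellation property can be illustrated with a minimal sketch. This is not the paper's algorithm, only an assumed toy construction: for every edge (i, j) of the communication graph, a shared random offset is added to agent i's cost and subtracted from agent j's, so each local cost is masked while the network-wide sum is unchanged.

```python
import random

def perturb_costs(costs, edges, scale=10.0, seed=0):
    """Mask local costs with edge-wise antisymmetric noise.

    For every edge (i, j), a shared random offset s is added to agent i's
    cost and subtracted from agent j's. Individual perturbed costs differ
    from the true local costs, but the sum over all agents is preserved,
    so the aggregate cost remains learnable.
    """
    rng = random.Random(seed)
    noisy = list(costs)
    for i, j in edges:
        s = rng.uniform(-scale, scale)
        noisy[i] += s
        noisy[j] -= s
    return noisy

# Local costs at one time step on a 4-agent cycle graph (hypothetical values).
costs = [1.0, 2.0, 3.0, 4.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
noisy = perturb_costs(costs, edges)

# The aggregate cost is preserved even though each local cost is hidden.
assert abs(sum(noisy) - sum(costs)) < 1e-9
```

An adversary observing a single agent's perturbed cost cannot identify the true local cost, while any quantity that depends only on the network-wide sum (such as the aggregate discounted cost) is unaffected.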