No-Regret Learning in Unknown Games with Correlated Payoffs

Sessa, Pier Giuseppe; Bogunovic, Ilija; Kamgarpour, Maryam; Krause, Andreas


arXiv:1909.08540 (cs)

[Submitted on 18 Sep 2019 (v1), last revised 28 Oct 2019 (this version, v2)]

Abstract: We consider the problem of learning to play a repeated multi-agent game with an unknown reward function. Single-player online learning algorithms attain strong regret bounds when provided with full-information feedback, which unfortunately is unavailable in many real-world scenarios. Bandit feedback alone, i.e., observing outcomes only for the selected action, yields substantially worse performance. In this paper, we consider a natural model where, besides a noisy measurement of the obtained reward, the player can also observe the opponents' actions. This feedback model, together with a regularity assumption on the reward function, allows us to exploit the correlations among different game outcomes by means of Gaussian processes (GPs). We propose a novel confidence-bound-based bandit algorithm, GP-MW, which uses the GP model for the reward function and runs a multiplicative weights (MW) method. We obtain novel kernel-dependent regret bounds that are comparable to the known bounds in the full-information setting, while substantially improving upon the existing bandit results. We experimentally demonstrate the effectiveness of GP-MW in random matrix games, as well as in real-world problems of traffic routing and movie recommendation. In our experiments, GP-MW consistently outperforms several baselines, while its performance is often comparable to methods that have access to full-information feedback.
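The abstract names three ingredients: a GP model of the reward, a confidence bound derived from it, and a multiplicative weights update. The following is a minimal illustrative sketch of how these pieces could fit together in a two-player matrix game. It is not the authors' implementation. The RBF kernel over integer action indices, the clipping of the upper confidence bound to [0, 1], the uniformly random opponent, and the parameters `eta`, `beta`, `noise`, and `lengthscale` are all assumptions made for illustration, as are the function names.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Squared-exponential kernel between rows of X and rows of Y (assumed kernel choice).
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq / lengthscale**2)

def gp_posterior(X_obs, y_obs, X_query, noise=0.1, lengthscale=1.0):
    # Zero-mean GP regression: posterior mean and standard deviation at X_query.
    if X_obs.shape[0] == 0:
        return np.zeros(len(X_query)), np.ones(len(X_query))
    K = rbf_kernel(X_obs, X_obs, lengthscale) + noise**2 * np.eye(len(X_obs))
    Ks = rbf_kernel(X_query, X_obs, lengthscale)
    mean = Ks @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

def run_gp_mw_sketch(reward_matrix, T=200, eta=0.1, beta=2.0, noise=0.1, seed=0):
    # reward_matrix[a, b]: reward in [0, 1] for own action a and opponent action b.
    # It is unknown to the learner and used here only to simulate noisy feedback.
    rng = np.random.default_rng(seed)
    n_own, n_opp = reward_matrix.shape
    weights = np.ones(n_own) / n_own
    X_obs, y_obs = [], []
    for _ in range(T):
        a = rng.choice(n_own, p=weights)   # sample own action from current weights
        b = rng.integers(n_opp)            # opponent plays uniformly at random (assumption)
        # Optimistic reward estimate for every own action against the observed b,
        # using only data gathered in previous rounds.
        X = np.array(X_obs, dtype=float).reshape(-1, 2)
        query = np.array([[i, b] for i in range(n_own)], dtype=float)
        mean, std = gp_posterior(X, np.array(y_obs), query, noise)
        ucb = np.clip(mean + beta * std, 0.0, 1.0)
        # Multiplicative weights update with optimistic losses 1 - ucb.
        weights = weights * np.exp(-eta * (1.0 - ucb))
        weights /= weights.sum()
        # Noisy reward feedback for the action actually played.
        X_obs.append([a, b])
        y_obs.append(reward_matrix[a, b] + noise * rng.standard_normal())
    return weights

if __name__ == "__main__":
    R = np.array([[0.9, 0.9], [0.1, 0.1]])  # toy game: action 0 is better against any b
    print(run_gp_mw_sketch(R))
```

In this toy game the first own action yields a higher reward against either opponent action, so the returned weight vector should concentrate on it. Because the opponent's action is observed, each round updates the estimates for all own actions at once rather than only the one played, which is the intuition behind the improvement over pure bandit feedback described in the abstract.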

Subjects:	Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA); Machine Learning (stat.ML)
Cite as:	arXiv:1909.08540 [cs.LG]
	(or arXiv:1909.08540v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1909.08540

Submission history

From: Pier Giuseppe Sessa
[v1] Wed, 18 Sep 2019 16:09:09 UTC (606 KB)
[v2] Mon, 28 Oct 2019 09:08:04 UTC (443 KB)