Coordination Failure in Cooperative Offline MARL

Tilbury, Callum Rhys; Formanek, Claude; Beyers, Louise; Shock, Jonathan P.; Pretorius, Arnu

arXiv:2407.01343 (cs)

[Submitted on 1 Jul 2024]

Abstract: Offline multi-agent reinforcement learning (MARL) leverages static datasets of experience to learn optimal multi-agent control. However, learning from static data presents several unique challenges. In this paper, we focus on coordination failure and investigate the role of joint actions in multi-agent policy gradients with offline data, concentrating on a common setting we refer to as the 'Best Response Under Data' (BRUD) approach. By using two-player polynomial games as an analytical tool, we demonstrate a simple yet overlooked failure mode of BRUD-based algorithms, which can lead to catastrophic coordination failure in the offline setting. Building on these insights, we propose an approach to mitigate such failure by prioritising samples from the dataset based on joint-action similarity during policy learning, and demonstrate its effectiveness in detailed experiments. More generally, we argue that prioritised dataset sampling is a promising area for innovation in offline MARL that can be combined with other effective approaches such as critic and policy regularisation. Importantly, our work shows how insights drawn from simplified, tractable games can lead to useful, theoretically grounded insights that transfer to more complex contexts. A core dimension of our offering is an interactive notebook, from which almost all of our results can be reproduced in a browser.

Comments:	Accepted at the Workshop on Aligning Reinforcement Learning Experimentalists and Theorists (ARLET) at the International Conference on Machine Learning, 2024
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Cite as:	arXiv:2407.01343 [cs.LG]
	(or arXiv:2407.01343v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2407.01343

Submission history

From: Callum Rhys Tilbury
[v1] Mon, 1 Jul 2024 14:51:29 UTC (5,821 KB)
