Computer Science > Computer Science and Game Theory
[Submitted on 2 Feb 2024 (v1), last revised 28 Mar 2024 (this version, v2)]
Title:The Danger Of Arrogance: Welfare Equilibra As A Solution To Stackelberg Self-Play In Non-Coincidental Games
View PDF HTML (experimental)Abstract:The increasing prevalence of multi-agent learning systems in society necessitates understanding how to learn effective and safe policies in general-sum multi-agent environments against a variety of opponents, including self-play. General-sum learning is difficult because of non-stationary opponents and misaligned incentives. Our first main contribution is to show that many recent approaches to general-sum learning can be derived as approximations to Stackelberg strategies, which suggests a framework for developing new multi-agent learning algorithms. We then define non-coincidental games as games in which the Stackelberg strategy profile is not a Nash Equilibrium. This notably includes several canonical matrix games and provides a normative theory for why existing algorithms fail in self-play in such games. We address this problem by introducing Welfare Equilibria (WE) as a generalisation of Stackelberg Strategies, which can recover desirable Nash Equilibria even in non-coincidental games. Finally, we introduce Welfare Function Search (WelFuSe) as a practical approach to finding desirable WE against unknown opponents, which finds more mutually desirable solutions in self-play, while preserving performance against naive learning opponents.
Submission history
From: Jake Levi [view email][v1] Fri, 2 Feb 2024 01:09:39 UTC (4,980 KB)
[v2] Thu, 28 Mar 2024 02:37:27 UTC (5,230 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.