The Danger Of Arrogance: Welfare Equilibra As A Solution To Stackelberg Self-Play In Non-Coincidental Games

Levi, Jake; Lu, Chris; Willi, Timon; de Witt, Christian Schroeder; Foerster, Jakob

(Submitted on 2 Feb 2024 (v1), last revised 28 Mar 2024 (this version, v2))

Abstract: The increasing prevalence of multi-agent learning systems in society necessitates understanding how to learn effective and safe policies in general-sum multi-agent environments against a variety of opponents, including self-play. General-sum learning is difficult because of non-stationary opponents and misaligned incentives. Our first main contribution is to show that many recent approaches to general-sum learning can be derived as approximations to Stackelberg strategies, which suggests a framework for developing new multi-agent learning algorithms. We then define non-coincidental games as games in which the Stackelberg strategy profile is not a Nash Equilibrium. This notably includes several canonical matrix games and provides a normative theory for why existing algorithms fail in self-play in such games. We address this problem by introducing Welfare Equilibria (WE) as a generalisation of Stackelberg Strategies, which can recover desirable Nash Equilibria even in non-coincidental games. Finally, we introduce Welfare Function Search (WelFuSe) as a practical approach to finding desirable WE against unknown opponents, which finds more mutually desirable solutions in self-play, while preserving performance against naive learning opponents.
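The abstract's definition of a non-coincidental game can be illustrated with a small sketch. The code below is not the authors' method. It assumes a simplified reading: one-shot two-player matrix games, pure strategies only, and a "Stackelberg strategy profile" taken to mean each player playing the action it would commit to as leader against a best-responding follower. The function names and payoff values are hypothetical and chosen only for illustration.

```python
import numpy as np


def stackelberg_leader_action(leader_payoff, follower_payoff):
    """Pure-strategy Stackelberg leader action (illustrative assumption).

    Both arrays are indexed [leader_action, follower_action]. The follower
    best-responds to the committed leader action; ties go to the lowest index.
    """
    best_value, best_action = -np.inf, None
    for i in range(leader_payoff.shape[0]):
        j = int(np.argmax(follower_payoff[i]))  # follower's best response to i
        if leader_payoff[i, j] > best_value:
            best_value, best_action = leader_payoff[i, j], i
    return best_action


def is_pure_nash(a, b, p1, p2):
    """Check whether (a, b) is a pure Nash equilibrium.

    p1 and p2 are indexed [player-1 action, player-2 action].
    """
    return bool(p1[a, b] >= p1[:, b].max() and p2[a, b] >= p2[a, :].max())


def is_non_coincidental(p1, p2):
    """True if both players acting as Stackelberg leaders is not a Nash equilibrium."""
    a = stackelberg_leader_action(p1, p2)      # player 1 as leader
    b = stackelberg_leader_action(p2.T, p1.T)  # player 2 as leader (axes swapped)
    return not is_pure_nash(a, b, p1, p2)


# Hypothetical payoffs. Chicken: action 0 = swerve, 1 = go straight.
chicken_p1 = np.array([[0, -1], [1, -10]])
chicken_p2 = chicken_p1.T

# Hypothetical payoffs. Prisoner's dilemma: action 0 = cooperate, 1 = defect.
pd_p1 = np.array([[-1, -3], [0, -2]])
pd_p2 = pd_p1.T

print(is_non_coincidental(chicken_p1, chicken_p2))  # True
print(is_non_coincidental(pd_p1, pd_p2))            # False
```

Under these assumptions, both Chicken players commit to going straight as leaders, and the resulting crash is not a Nash equilibrium, so the game counts as non-coincidental. In the one-shot prisoner's dilemma, both leader actions are "defect", which is a Nash equilibrium. The paper's own setting and definitions may differ from this simplification.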

Comments:	31 pages, 23 figures
Subjects:	Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)
Cite as:	arXiv:2402.01088 [cs.GT]
	(or arXiv:2402.01088v2 [cs.GT] for this version)

Submission history

From: Jake Levi
[v1] Fri, 2 Feb 2024 01:09:39 GMT (4980kb,D)
[v2] Thu, 28 Mar 2024 02:37:27 GMT (5230kb,D)
