Title: Limiting dynamics for Q-learning with memory one in symmetric two-player, two-action games

Abstract: We develop a method based on computer algebra systems to represent the mutual pure strategy best-response dynamics of symmetric two-player, two-action repeated games played by players with a one-period memory. We apply this method to the iterated prisoner's dilemma, stag hunt and hawk-dove games and identify all possible equilibrium strategy pairs and the conditions for their existence. The only equilibrium strategy pair that is possible in all three games is the win-stay, lose-shift strategy. Lastly, we show that the mutual best-response dynamics are realized by a sample batch Q-learning algorithm in the infinite batch size limit.
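As a rough illustration of what a mutual best-response check over memory-one pure strategies can look like, the sketch below enumerates the 16 such strategies in a two-action game and tests which ones are best responses to themselves. It is not the paper's method: the paper works symbolically with computer algebra systems, whereas this sketch uses plain numerical value iteration. The payoff values (T=4, R=3, P=1, S=0), the discount factor `DELTA = 0.9`, the discounted-payoff criterion, and all function names are assumptions chosen for illustration.

```python
from itertools import product

C, D = 0, 1  # cooperate, defect
# A state is (my last action, opponent's last action).
STATES = [(C, C), (C, D), (D, C), (D, D)]

# Assumed prisoner's dilemma payoffs to the row player: T=4, R=3, P=1, S=0.
PAYOFF = {(C, C): 3.0, (C, D): 0.0, (D, C): 4.0, (D, D): 1.0}
DELTA = 0.9  # assumed discount factor


def q_values(opponent, delta=DELTA, sweeps=2000):
    """Optimal discounted Q-values against a fixed memory-one opponent.

    `opponent` maps a state, seen from the opponent's side, to an action.
    Transitions are deterministic, so this is plain value iteration.
    """
    q = {(s, a): 0.0 for s in STATES for a in (C, D)}
    for _ in range(sweeps):
        v = {s: max(q[s, C], q[s, D]) for s in STATES}
        new_q = {}
        for mine, theirs in STATES:
            b = opponent[(theirs, mine)]  # opponent sees the mirrored state
            for a in (C, D):
                new_q[(mine, theirs), a] = PAYOFF[a, b] + delta * v[(a, b)]
        q = new_q
    return q


def is_best_response(strategy, opponent, tol=1e-9):
    """True if `strategy` picks a Q-maximising action in every state."""
    q = q_values(opponent)
    return all(q[s, strategy[s]] >= max(q[s, C], q[s, D]) - tol for s in STATES)


# All 2^4 = 16 memory-one pure strategies.
ALL_STRATEGIES = [dict(zip(STATES, acts)) for acts in product((C, D), repeat=4)]

# Strategies that are a best response to themselves (symmetric pairs only).
symmetric_equilibria = [s for s in ALL_STRATEGIES if is_best_response(s, s)]

WSLS = {(C, C): C, (C, D): D, (D, C): D, (D, D): C}  # win-stay, lose-shift
ALLD = {s: D for s in STATES}  # always defect
ALLC = {s: C for s in STATES}  # always cooperate
```

Under these assumed parameters, `WSLS in symmetric_equilibria` evaluates to `True`, while `ALLC in symmetric_equilibria` evaluates to `False`. The check requires optimality in all four states, including states not reached on the equilibrium path, and it only covers symmetric pairs; the paper's existence conditions are stated in terms of the game parameters and may use a different criterion.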
Comments: 30 pages, 12 figures
Subjects: Dynamical Systems (math.DS); Adaptation and Self-Organizing Systems (nlin.AO)
Cite as: arXiv:2107.13995 [math.DS]
  (or arXiv:2107.13995v2 [math.DS] for this version)

Submission history

From: Janusz Meylahn
[v1] Thu, 29 Jul 2021 14:13:48 GMT (334kb,D)
[v2] Mon, 3 Oct 2022 13:38:02 GMT (22192kb,D)