We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.AI

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Artificial Intelligence

Title: MiniZero: Comparative Analysis of AlphaZero and MuZero on Go, Othello, and Atari Games

Abstract: This paper presents MiniZero, a zero-knowledge learning framework that supports four state-of-the-art algorithms, including AlphaZero, MuZero, Gumbel AlphaZero, and Gumbel MuZero. While these algorithms have demonstrated super-human performance in many games, it remains unclear which among them is most suitable or efficient for specific tasks. Through MiniZero, we systematically evaluate the performance of each algorithm in two board games, 9x9 Go and 8x8 Othello, as well as 57 Atari games. For two board games, using more simulations generally results in higher performance. However, the choice of AlphaZero and MuZero may differ based on game properties. For Atari games, both MuZero and Gumbel MuZero are worth considering. Since each game has unique characteristics, different algorithms and simulations yield varying results. In addition, we introduce an approach, called progressive simulation, which progressively increases the simulation budget during training to allocate computation more efficiently. Our empirical results demonstrate that progressive simulation achieves significantly superior performance in two board games. By making our framework and trained models publicly available, this paper contributes a benchmark for future research on zero-knowledge learning algorithms, assisting researchers in algorithm selection and comparison against these zero-knowledge learning baselines. Our code and data are available at this https URL
Comments: Accepted by IEEE Transactions on Games
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2310.11305 [cs.AI]
  (or arXiv:2310.11305v3 [cs.AI] for this version)

Submission history

From: Hung Guei [view email]
[v1] Tue, 17 Oct 2023 14:29:25 GMT (4456kb,D)
[v2] Wed, 15 Nov 2023 05:49:17 GMT (4462kb,D)
[v3] Fri, 26 Apr 2024 09:16:15 GMT (4545kb,D)

Link back to: arXiv, form interface, contact.