Accelerating Self-Play Learning in Go

Wu, David J.

arXiv:1902.10565v2 (cs)

[Submitted on 27 Feb 2019 (v1), revised 1 Mar 2019 (this version, v2), latest version 9 Nov 2020 (v5)]

Abstract: By introducing several new Go-specific and non-Go-specific techniques along with other tuning, we accelerate self-play learning in Go. Like AlphaZero and Leela Zero, a popular open-source distributed project based on AlphaZero, our bot KataGo only learns from neural net Monte-Carlo tree-search self-play. With our techniques, in only a week with several dozen GPUs it achieves a likely strong pro or perhaps just-super-human level of strength. Compared to Leela Zero, we estimate a roughly 5x reduction in self-play computation required to achieve that level of strength, as well as a 30x to 100x reduction for reaching moderate to strong amateur levels. Although we so far have not tested in longer runs, we believe that our techniques hold promise for future research.

Comments:	38 pages including appendices, 13 figures, 5 tables. Corrected typo in PUCT formula and an incorrect attribution in a footnote, minor notational clarifications
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1902.10565 [cs.LG]
	(or arXiv:1902.10565v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1902.10565

Submission history

From: David Wu
[v1] Wed, 27 Feb 2019 14:51:51 UTC (1,576 KB)
[v2] Fri, 1 Mar 2019 17:45:10 UTC (1,576 KB)
[v3] Tue, 17 Sep 2019 00:40:26 UTC (925 KB)
[v4] Thu, 6 Feb 2020 15:30:15 UTC (926 KB)
[v5] Mon, 9 Nov 2020 18:17:55 UTC (925 KB)
