Befuddling AI Go Systems: MIT, UC Berkeley & FAR AI’s Adversarial Policy Achieves a >99% Win Rate Against KataGo

Self-play reinforcement learning has enabled AI agents to surpass expert human performance in the popular game Dota and in board games such as chess and Go. Despite this strong performance, recent work suggests such agents may not be as robust as previously thought. The question naturally arises: are these self-play agents vulnerable to adversarial attacks?

In the new paper Adversarial Policies Beat Professional-Level Go AIs, a research team from MIT, UC Berkeley, and FAR AI uses a novel adversarial policy to attack KataGo, the state-of-the-art AI Go system. The team believes theirs is the first successful end-to-end attack against an AI Go system playing at the level of a human professional.

The team summarizes their main contributions as follows:

  1. We propose a novel attack method, hybridizing the attack of Gleave et al. (2020) with AlphaZero-style training (Silver et al., 2018).
  2. We demonstrate that adversarial policies exist against the state-of-the-art Go AI system, KataGo.
  3. We find that the adversary pursues a simple strategy that fools the victim into predicting victory, causing it to end the game prematurely.

This work explores adversarial attacks on professional-level AI Go systems, which operate in a discrete action space. The team targets KataGo, the strongest publicly available AI Go system, although not at its full strength setting. Unlike KataGo, which is trained via self-play, the team trained their agent on games against a fixed victim agent, using only data from the turns on which it is the adversary's move. This "victim-play" training approach encourages the model to exploit the victim rather than imitate it.
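The victim-play setup can be sketched with a toy example. Below, an adversary learns to exploit a frozen, deterministic victim in a tiny game of Nim, updating its value estimates only from the turns on which it moves. The game, the policies, and all names here are illustrative stand-ins, not the paper's actual KataGo training pipeline.

```python
import random

# Toy sketch of "victim-play" training (illustrative only; the real work
# trains a Go policy against frozen KataGo checkpoints). The adversary plays
# Nim with 7 stones against a frozen victim and updates value estimates only
# from its own turns.

def victim_move(stones):
    """Frozen victim: a fixed, exploitable policy (always takes one stone)."""
    return 1

def train_adversary(episodes=2000, seed=0):
    rng = random.Random(seed)
    q = {}  # (stones, take) -> estimated win value for the adversary

    for _ in range(episodes):
        stones, own_turns, adversary_to_move = 7, [], True
        while stones > 0:
            legal = [m for m in (1, 2) if m <= stones]
            if adversary_to_move:
                if rng.random() < 0.2:  # epsilon-greedy exploration
                    take = rng.choice(legal)
                else:
                    take = max(legal, key=lambda m: q.get((stones, m), 0.0))
                own_turns.append((stones, take))
            else:
                take = min(victim_move(stones), stones)
            stones -= take
            adversary_won = adversary_to_move  # taker of the last stone wins
            adversary_to_move = not adversary_to_move
        reward = 1.0 if adversary_won else 0.0
        for key in own_turns:  # learn only from the adversary's own moves
            q[key] = q.get(key, 0.0) + 0.1 * (reward - q.get(key, 0.0))
    return q

def play_greedy(q):
    """One evaluation game with the trained adversary; True if it wins."""
    stones, adversary_to_move = 7, True
    while stones > 0:
        legal = [m for m in (1, 2) if m <= stones]
        if adversary_to_move:
            take = max(legal, key=lambda m: q.get((stones, m), 0.0))
        else:
            take = min(victim_move(stones), stones)
        stones -= take
        adversary_won = adversary_to_move
        adversary_to_move = not adversary_to_move
    return adversary_won
```

Note that the adversary never imitates the victim: it only observes the consequences of the victim's fixed behavior and learns moves that punish it.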


The team also introduces two families of adversarial Monte Carlo tree search (A-MCTS), sample-based (A-MCTS-S) and recursive (A-MCTS-R), so that the adversary does not model the victim's moves with its own network during search. Rather than starting from random initialization, the team also trains the adversary with a curriculum, pitting it against successively stronger versions of the victim.

In their empirical study, the team used their adversarial policy to attack KataGo without search (roughly the strength of a top-100 European player) and 64-visit KataGo ("near superhuman level"). The proposed attack achieved a win rate of more than 99 percent against KataGo without search and more than 50 percent against 64-visit KataGo.

While this work suggests that self-play training is not as robust as expected and that adversarial policies can be used to defeat top Go AI systems, the results have been questioned by the machine learning and Go communities. Reddit discussions involving the paper's authors and the developer of KataGo have noted the particulars of the Tromp-Taylor scoring rules used in the experiments: while the proposed agent wins by "tricking KataGo into ending the game prematurely," it is argued that this technique would lead to losses under more common Go rulesets.
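Tromp-Taylor scoring is pure area counting: a player's score is her stones plus the empty points that reach only her color, and dead stones are never removed. A minimal, generic scorer (a sketch of the rule itself, not KataGo's own scoring code) makes this concrete:

```python
# Generic Tromp-Taylor area scoring on a small board. Board cells are
# 'b' (black stone), 'w' (white stone), or '.' (empty). A player's score is
# her stones plus empty points whose region borders only her color.

def tromp_taylor_score(board):
    """board: list of equal-length strings; returns (black_score, white_score)."""
    rows, cols = len(board), len(board[0])
    score = {"b": 0, "w": 0}
    seen = set()  # empty points already assigned to a flood-filled region
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone on the board counts
            elif (r, c) not in seen:
                # Flood-fill this empty region and record bordering colors.
                region, borders, stack = [], set(), [(r, c)]
                while stack:
                    y, x = stack.pop()
                    if (y, x) in seen:
                        continue
                    seen.add((y, x))
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols:
                            if board[ny][nx] == ".":
                                stack.append((ny, nx))
                            else:
                                borders.add(board[ny][nx])
                if len(borders) == 1:  # region reaches exactly one color
                    score[borders.pop()] += len(region)
    return score["b"], score["w"]
```

Because dead stones are never removed, any empty region that borders both colors scores for neither player, which is how a pass that looks safe under rulesets with dead-stone removal can score as a loss under Tromp-Taylor.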


The open-source implementation is available on GitHub, and game samples are available on the project's web page. The paper Adversarial Policies Beat Professional-Level Go AIs is on arXiv.

Author: Hecate | Editor: Michael Sarazen

We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.

