Go is considered much more difficult for computers to win than other games such as chess: its strategic and aesthetic nature makes it hard to construct a direct evaluation function, and its much larger branching factor makes traditional AI methods such as alpha–beta pruning, tree traversal and heuristic search prohibitively expensive.
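As a rough illustration of why the branching factor matters, here is a minimal alpha–beta pruning sketch; the `legal_moves`, `apply`, and `evaluate` callbacks are hypothetical placeholders for a concrete game implementation, and the evaluation function is exactly the component that is hard to hand-craft for Go.

```python
# Minimal alpha-beta pruning sketch. `state` is any game-state object, and
# legal_moves(state), apply(state, move), evaluate(state) are hypothetical
# callbacks supplied by a game implementation. Chess has a branching factor
# of roughly 35 per move; Go's is roughly 250, so even with pruning the
# search tree for Go is astronomically larger.

def alphabeta(state, depth, alpha, beta, maximizing,
              legal_moves, apply, evaluate):
    moves = legal_moves(state)
    if depth == 0 or not moves:
        return evaluate(state)  # heuristic score; this is what is hard for Go
    if maximizing:
        value = float("-inf")
        for move in moves:
            value = max(value, alphabeta(apply(state, move), depth - 1,
                                         alpha, beta, False,
                                         legal_moves, apply, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:   # prune: the minimizer will avoid this branch
                break
        return value
    else:
        value = float("inf")
        for move in moves:
            value = min(value, alphabeta(apply(state, move), depth - 1,
                                         alpha, beta, True,
                                         legal_moves, apply, evaluate))
            beta = min(beta, value)
            if beta <= alpha:   # prune: the maximizer will avoid this branch
                break
        return value
```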
AlphaGo's techniques are probably less useful in domains that are difficult to simulate, such as learning how to drive a car. [17] DeepMind stated in October 2017 that it had already begun active work on applying AlphaGo Zero technology to protein folding, and that it would soon publish new findings. [18] [19]
In May 2017, AlphaGo beat Ke Jie, the world's top-ranked player at the time, [29] [30] in a three-game match during the Future of Go Summit. [31] In October 2017, DeepMind revealed a new version of AlphaGo, trained only through self-play, that had surpassed all previous versions, beating the Ke Jie version in 89 out of 100 games. [32]
AlphaZero (AZ) is a more generalized variant of the AlphaGo Zero (AGZ) algorithm, able to play shogi and chess as well as Go. Differences between AZ and AGZ include: [2] AZ has hard-coded rules for setting search hyperparameters; the neural network is updated continually rather than in discrete generations; and AZ does not exploit board symmetries, unlike AGZ.
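The symmetry difference is easy to make concrete: the Go board has eight dihedral symmetries (four rotations, each optionally reflected), which AlphaGo Zero used to augment training data and randomize evaluations, while AlphaZero drops this because chess and shogi positions are not rotation-invariant. A minimal NumPy sketch of the eight transforms:

```python
import numpy as np

def dihedral_transforms(board):
    """Yield the 8 symmetries of a square Go board (4 rotations x reflection).
    AlphaGo Zero samples/averages over these during training and search;
    AlphaZero skips this step since chess and shogi lack such symmetry."""
    for k in range(4):
        rotated = np.rot90(board, k)
        yield rotated
        yield np.fliplr(rotated)

board = np.zeros((19, 19), dtype=np.int8)
board[3, 5] = 1  # a single stone placed off any symmetry axis
print(len({t.tobytes() for t in dihedral_transforms(board)}))  # -> 8 distinct views
```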
The networks used in KataGo are ResNets with pre-activation. While AlphaGo Zero takes only game board history as input features (it was designed as a general architecture for board games, subsequently becoming AlphaZero), the input to KataGo's network contains additional features designed by hand specifically for playing Go.
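A rough sketch of that contrast follows; the feature names are illustrative rather than KataGo's exact plane list, which is larger and documented in the KataGo paper. AlphaGo Zero's input is just stacked stone positions plus a side-to-move plane, whereas a KataGo-style input appends hand-designed Go features such as liberty counts:

```python
import numpy as np

N = 19  # board size

def agz_style_input(history_black, history_white, black_to_move, t=8):
    """AlphaGo Zero-style input: the last t positions per colour plus one
    side-to-move plane. Shape (2*t + 1, N, N); nothing Go-specific beyond
    the raw stone placements."""
    planes = list(history_black[-t:]) + list(history_white[-t:])
    planes.append(np.full((N, N), float(black_to_move), dtype=np.float32))
    return np.stack(planes)

def katago_style_input(board_planes, liberty_counts):
    """Illustrative KataGo-style input: board history plus hand-designed
    Go features. Here only liberty-count indicator planes are shown as a
    stand-in for the real, richer feature set."""
    lib_planes = [(liberty_counts == k).astype(np.float32) for k in (1, 2, 3)]
    return np.concatenate([board_planes, np.stack(lib_planes)])

# Tiny demo with empty boards and no liberties anywhere:
hist = [np.zeros((N, N), dtype=np.float32)] * 8
x = katago_style_input(agz_style_input(hist, hist, True), np.zeros((N, N), int))
print(x.shape)  # (20, 19, 19): 17 AGZ-style planes + 3 illustrative feature planes
```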
Master is a version of DeepMind's Go software AlphaGo, named after the online account (originally Magister/Magist) under which it won 60 straight games against human professional Go players from 29 December 2016 to 4 January 2017.
Leela Zero is an (almost) exact replication of AlphaGo Zero in both training process and architecture. [13] The training process is Monte Carlo tree search with self-play, exactly as in AlphaGo Zero, and the architecture matches AlphaGo Zero's with one difference. Consider the last released model, 0e9ea880.
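The tree search both projects share selects moves with the PUCT rule from the AlphaGo Zero paper: at each node, pick the child maximizing Q(s,a) + U(s,a), where U(s,a) = c_puct · P(s,a) · sqrt(Σ_b N(s,b)) / (1 + N(s,a)). A minimal sketch of that selection step, with a simplified node structure (the real implementations store considerably more state):

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    prior: float                 # P(s, a) from the policy network
    visits: int = 0              # N(s, a)
    value_sum: float = 0.0       # W(s, a); Q(s, a) = W / N
    children: list = field(default_factory=list)

def select_child(node, c_puct=1.5):
    """One PUCT selection step (AlphaGo Zero / Leela Zero-style MCTS):
    choose the child maximizing Q + U, where U favours high-prior moves
    that have been visited rarely. `node` must already be expanded."""
    sqrt_total = math.sqrt(sum(c.visits for c in node.children) or 1)
    def score(child):
        q = child.value_sum / child.visits if child.visits else 0.0
        u = c_puct * child.prior * sqrt_total / (1 + child.visits)
        return q + u
    return max(node.children, key=score)

root = Node(prior=1.0, children=[Node(prior=0.6), Node(prior=0.4)])
print(select_child(root).prior)  # 0.6: with no visits yet, the prior decides
```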
MuZero (MZ) is a combination of the high-performance planning of the AlphaZero (AZ) algorithm with approaches to model-free reinforcement learning. The combination allows for more efficient training in classical planning regimes, such as Go, while also handling domains with much more complex inputs at each stage, such as visual video games.
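MuZero's key idea can be sketched as three learned functions: a representation network h mapping observations to a hidden state, a dynamics network g stepping that hidden state forward given an action, and a prediction network f producing policy and value. Planning unrolls g instead of a real simulator. The sketch below uses untrained random linear maps as stand-ins for the networks, and omits the reward head that MuZero's real dynamics function also predicts:

```python
import numpy as np

rng = np.random.default_rng(0)

H, A = 16, 4                       # hidden-state size, number of actions
W_h = rng.normal(size=(H, 64))     # h: observation -> hidden state
W_g = rng.normal(size=(H, H + A))  # g: (hidden state, action) -> next state
W_f = rng.normal(size=(A + 1, H))  # f: hidden state -> (policy logits, value)

def h(observation):                # representation function
    return np.tanh(W_h @ observation)

def g(state, action):              # learned dynamics: no environment needed
    onehot = np.eye(A)[action]
    return np.tanh(W_g @ np.concatenate([state, onehot]))

def f(state):                      # prediction: policy logits and value
    out = W_f @ state
    return out[:A], out[A]

# Plan by unrolling the *learned* model rather than the real environment:
s = h(rng.normal(size=64))         # embed a (random placeholder) observation
for a in [0, 2, 1]:                # any hypothetical action sequence
    logits, value = f(s)
    s = g(s, a)                    # imagined transition in latent space
print("value after unroll:", f(s)[1])
```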