DeepMind’s AI became a superhuman chess player in a few hours, just for fun

The descendant of DeepMind’s world champion Go program stretches its muscles in a new domain

Photo by Jason Kempin/Getty Images for Agon Limited (2016 World Chess Championship, November 12)

The endgame for Google’s AI subsidiary DeepMind was never beating people at board games. It’s always been about creating something akin to a combustion engine for intelligence: a generic thinking machine that can be applied to a broad range of challenges. The company is still a long way from achieving this goal, but new research published by its scientists this week suggests they’re at least headed down the right path.

In the paper, DeepMind describes how a descendant of the AI program that first conquered the board game Go has taught itself to play a number of other games at a superhuman level. After eight hours of self-play, the program bested the AI that first beat the human world Go champion, and after four hours of training, it beat the reigning world-champion chess program, Stockfish. Then, for a victory lap, it trained for just two hours and polished off Elmo, one of the world’s best shogi programs (shogi being a Japanese version of chess played on a bigger board).

For each game, the AI program taught itself how to play

One of the key advances here is that the new AI program, named AlphaZero, wasn’t specifically designed to play any of these games. In each case, it was given the basic rules of the game (how knights move in chess, and so on) but programmed with no other strategies or tactics. It got better simply by playing itself over and over again at an accelerated pace, a method of training AI known as “reinforcement learning.”
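AlphaZero itself pairs a deep neural network with Monte Carlo tree search, but the core loop of learning purely from self-play can be sketched in miniature. The toy below is a heavily simplified stand-in, not DeepMind’s code: it learns tic-tac-toe with a plain lookup table of position values (all names and numbers here are illustrative), playing game after game against itself and nudging the value of every position it visited toward the final result.

```python
# A deliberately tiny sketch of self-play reinforcement learning:
# a tabular value function for tic-tac-toe that improves only by
# playing games against itself. AlphaZero replaces this lookup table
# with a deep neural network and guides play with Monte Carlo tree
# search; everything below is an illustrative stand-in.
import random
from collections import defaultdict

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

values = defaultdict(float)  # learned value of each position, from X's point of view

def choose_move(board, player, epsilon=0.1):
    """Pick the move whose resulting position looks best, exploring occasionally."""
    moves = [i for i, cell in enumerate(board) if cell == "."]
    if random.random() < epsilon:
        return random.choice(moves)
    sign = 1 if player == "X" else -1  # O prefers positions that are bad for X
    return max(moves, key=lambda m: sign * values[board[:m] + player + board[m + 1:]])

def self_play_game():
    """Play one full game against itself; return visited positions and the result."""
    board, player, visited = "." * 9, "X", []
    while True:
        m = choose_move(board, player)
        board = board[:m] + player + board[m + 1:]
        visited.append(board)
        if winner(board):
            return visited, 1.0 if player == "X" else -1.0
        if "." not in board:
            return visited, 0.0  # draw
        player = "O" if player == "X" else "X"

# The training loop: play, then pull every visited position's value
# toward the game's final outcome. No human games, no opening books.
for _ in range(20000):
    visited, outcome = self_play_game()
    for position in visited:
        values[position] += 0.05 * (outcome - values[position])
```

Run long enough, the table steers both sides toward stronger play, which is the same feedback loop AlphaZero exploits at vastly greater scale and on far harder games.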

Using reinforcement learning in this way isn’t new in and of itself. DeepMind’s engineers used the same method to create AlphaGo Zero, the AI program unveiled this October. But, as this week’s paper describes, the new AlphaZero is a “more generic version” of the same software, meaning it can be applied to a broader range of tasks without being primed beforehand.

What’s remarkable here is that in less than 24 hours, the same computer program was able to teach itself how to play three complex board games at superhuman levels. That’s a new feat for the world of AI.

This takes DeepMind just that little bit closer to building the generic thinking machine the company dreams of, but major challenges lie ahead. When DeepMind CEO Demis Hassabis showed off AlphaGo Zero earlier this year, he suggested that a future version of the program could help with a range of scientific problems, from designing new drugs to discovering new materials. But those problems are qualitatively very different from playing board games, and a whole lot of work needs to be done to find out how exactly these algorithms can tackle them. All we can say for certain now is that artificial intelligence has definitely moved on from just playing chess.