Self-taught, 'superhuman' AI now even smarter: makers

October 18, 2017 by Mariëtte Le Roux
There are an astonishing 10 to the power of 170 possible board configurations in Go - more than the number of atoms in the known universe. Credit: DeepMind

The computer that stunned humanity by beating the best mortal players at a strategy board game requiring "intuition" has become even smarter, its makers said Wednesday.

Even more startling, the updated version of AlphaGo is entirely self-taught—a major step towards the rise of machines that achieve superhuman abilities "with no human input", they reported in the science journal Nature.

Dubbed AlphaGo Zero, the Artificial Intelligence (AI) system learnt by itself, within days, to master the ancient Chinese board game known as "Go"—said to be the most complex two-person challenge ever invented.

It came up with its own novel moves to eclipse all the Go acumen humans have acquired over thousands of years.

After just three days of self-training, it was put to the ultimate test against AlphaGo, its forerunner, which had previously dethroned the top human champions.

AlphaGo Zero won by 100 games to zero.

"AlphaGo Zero not only rediscovered the common patterns and openings that humans tend to play... it ultimately discarded them in preference for its own variants which humans don't even know about or play at the moment," said AlphaGo lead researcher David Silver.

The 3,000-year-old Chinese game, played with black and white stones on a board, has more possible configurations than there are atoms in the universe.
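
That figure is easy to sanity-check. Each of the 361 points on the 19x19 board can be empty, black or white, giving 3^361 (about 10^172) raw configurations, of which roughly 10^170 are legal positions. A quick check in Python (the atom count used here is a standard order-of-magnitude estimate, not a figure from the article):

```python
# Back-of-the-envelope check of the "more configurations than atoms" claim.
from math import log10

points = 19 * 19                    # 361 intersections on a Go board
raw_configs = 3 ** points           # each point: empty, black or white
print(f"3^361 is about 10^{log10(raw_configs):.0f}")   # -> 10^172

atoms_in_universe = 10 ** 80        # common order-of-magnitude estimate
print(raw_configs > atoms_in_universe)   # -> True, by ~92 orders of magnitude
```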

AlphaGo made world headlines with its shock 4-1 victory in March 2016 over 18-time Go champion Lee Se-Dol, one of the game's all-time masters.

Lee's defeat showed that AI was progressing faster than widely thought, said experts at the time who called for rules to make sure powerful AI always remains completely under human control.

In May this year, an updated AlphaGo Master programme beat world Number One Ke Jie in three matches out of three.

Not constrained by humans

Unlike its predecessors, which trained on data from thousands of human games before practising against themselves, AlphaGo Zero did not learn from humans or by playing against them, according to researchers at DeepMind, the British AI company developing the system.

"All previous versions of AlphaGo... were told: 'Well, in this position the human expert played this particular move, and in this other position the human expert played here'," Silver said in a video explaining the advance.

AlphaGo Zero skipped this step.

Instead, it was programmed to respond to reward—a positive point for a win versus a negative point for a loss.

Starting with just the rules of Go and no instructions, the system learnt the game, devised strategy and improved as it competed against itself—starting with "completely random play" to figure out how the reward is earned.

This is a trial-and-error process known as "reinforcement learning".
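
As a rough sketch of that trial-and-error loop (illustrative only; AlphaGo Zero actually pairs a deep neural network with Monte Carlo tree search, and Go is vastly larger than any toy game), self-play reinforcement learning on a miniature game might look like this:

```python
# Toy self-play reinforcement learning (an illustrative sketch only:
# AlphaGo Zero pairs a deep neural network with Monte Carlo tree search).
# Stand-in game: Nim with 7 stones, take 1 or 2 per turn, and whoever
# takes the last stone wins.
import random

ACTIONS = (1, 2)
value = {}          # state (stones left) -> estimated value for the mover

def choose(stones, epsilon):
    """Epsilon-greedy: usually pick the move that leaves the opponent
    the worst-valued state, occasionally explore at random."""
    legal = [a for a in ACTIONS if a <= stones]
    if random.random() < epsilon:
        return random.choice(legal)
    return min(legal, key=lambda a: value.get(stones - a, 0.0))

def train(games=20000, epsilon=0.1, lr=0.1):
    for _ in range(games):
        stones, visited = 7, []
        while stones > 0:
            visited.append(stones)          # state seen by the mover
            stones -= choose(stones, epsilon)
        # The player who just moved took the last stone and wins:
        # +1 for the winner's states, -1 for the loser's, alternating.
        reward = 1.0
        for state in reversed(visited):
            v = value.get(state, 0.0)
            value[state] = v + lr * (reward - v)
            reward = -reward                # zero-sum: flip sign per ply

random.seed(0)
train()
print({s: round(v, 2) for s, v in sorted(value.items())})
# States 3 and 6 end up near -1: whoever must move there loses against
# good play - a "strategy" found purely from the win/loss reward.
```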

Unlike its predecessors, AlphaGo Zero "is no longer constrained by the limits of human knowledge," Silver and DeepMind CEO Demis Hassabis wrote in a blog.

Amazingly, AlphaGo Zero ran on a single machine and used a single human brain-mimicking "neural network", compared to the multi-machine "brain" that beat Lee.

It ran on four of Google's tensor processing units (TPUs), compared with the 48 behind the original AlphaGo, and played 4.9 million training games over three days, against its predecessor's 30 million over several months.
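
The "single neural network" point is that AlphaGo Zero merged what had been two separate networks—one proposing moves (the policy) and one judging positions (the value)—into one network with a shared body and two output heads. A minimal sketch of that shape in plain NumPy (toy sizes and random weights; the real system is a deep residual network trained on the self-play games):

```python
# Shape of a two-headed network: one shared trunk, a policy head scoring
# all moves, and a value head judging the position. Toy sizes only.
import numpy as np

rng = np.random.default_rng(0)
BOARD, HIDDEN = 19 * 19, 128

W_trunk = rng.normal(0, 0.1, (BOARD, HIDDEN))       # shared layers
W_policy = rng.normal(0, 0.1, (HIDDEN, BOARD + 1))  # +1 for "pass"
W_value = rng.normal(0, 0.1, (HIDDEN, 1))

def forward(board):
    """board: flat array of 361 entries in {-1, 0, +1} (white/empty/black)."""
    h = np.tanh(board @ W_trunk)          # one shared representation...
    logits = h @ W_policy                 # ...feeds both heads
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()                # softmax over 362 moves
    value = np.tanh(h @ W_value)[0]       # scalar in (-1, +1): who's winning
    return policy, value

policy, value = forward(np.zeros(BOARD))
print(policy.shape, round(float(value), 3))   # (362,) and one evaluation
```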

Beginning of the end?

"People tend to assume that machine learning is all about big data and massive amounts of computation but actually what we saw with AlphaGo Zero is that algorithms matter much more," said Silver.

The findings suggested that AI based on reinforcement learning performs better than systems that rely on human expertise, Satinder Singh of the University of Michigan wrote in a commentary also carried by Nature.

"However, this is not the beginning of any end because AlphaGo Zero, like all other successful AI so far, is extremely limited in what it knows and in what it can do compared with humans and even other animals," he said.

AlphaGo Zero's ability to learn on its own "might appear creepily autonomous", added Anders Sandberg of the Future of Humanity Institute at Oxford University.

But there was an important difference, he told AFP, "between the general-purpose smarts humans have and the specialised smarts" of computer software.

"What DeepMind has demonstrated over the past years is that one can make software that can be turned into experts in different domains... but it does not become generally intelligent."

It was also worth noting that AlphaGo was not programming itself, said Sandberg.

"The clever insights making Zero better was due to humans, not any piece of software suggesting that this approach would be good. I would start to get worried when that happens."

More information: Mastering the game of Go without human knowledge, Nature (2017). DOI: 10.1038/nature24270. nature.com/articles/doi:10.1038/nature24270

13 comments

Eikka
2.3 / 5 (3) Oct 18, 2017
"People tend to assume that machine learning is all about big data and massive amounts of computation but actually what we saw with AlphaGo Zero is that algorithms matter much more," said Silver.


4.9 million training games isn't big data and massive amounts of computation?

If a person were to play 8 hours of Go each day for 10 years, assuming a single game takes 30 minutes, they would only complete about 60,000 games. The computer has the equivalent of 800 years of human time to explore the problem space, which makes the task easy for even the dumbest evolutionary/reinforcement algorithm. You can try almost everything at random to see if it works.
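
For what it's worth, the comment's numbers roughly check out (a quick verification using the comment's own assumptions):

```python
# Checking the comment's assumptions: 8 h/day for 10 years at 30 min
# per game, versus 4.9 million self-play games in three days.
games_per_day = (8 * 60) // 30              # 16 games a day
human_games = games_per_day * 365 * 10
print(human_games)                          # 58,400 -> "only ~60,000"

machine_games = 4_900_000
print(machine_games / (human_games / 10))   # ~839 -> the "~800 years"
```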
Hyperfuzzy
3.5 / 5 (2) Oct 18, 2017
Wait, are you comparing Logic to Intuition?
KBK
1.2 / 5 (5) Oct 18, 2017
The algorithms required for it to self learn in a 'human' fashion were created and announced openly, two decades ago, as the internet was coming on line.

That program went black.

Proving that would involve telling you who, what, when, where, why, how, and the algorithmic aspects.

For your safety and the future prospects of humanity... I won't go that far, at any price. Then you would know too, and that's not a good thing. Less is more, in this case. More life.

It's not ego, here. You can think it's ego if you want, but it is true, frighteningly true.

Just thought it was a good idea for you (the given reader) to know this.
SwamiOnTheMountain
1.2 / 5 (6) Oct 18, 2017
I'm still shocked that they think GO is a difficult game. Having a lot of configurations doesn't make it a difficult game. It just makes it so a computer can't cheat and just do a brute force attack.

Also it's not very impressive that a computer that's played millions of games is better than a human that's played thousands of games. It's not faster at learning, it's faster at playing.
What happens when the computer gets a game that it cannot play at 20 games per second? It would be screwed trying to learn via this method.

Though if the goal is getting an AI to play a very simple abstract game really well, then they have done it. The AIs in video games are still bad.
Hyperfuzzy
5 / 5 (1) Oct 18, 2017
It's algorithmic by a human, either worse or better. Stop acting like AI has a mind of its own!
Whydening Gyre
not rated yet Oct 18, 2017
It's algorithmic by a human, either worse or better. Stop acting like AI has a mind of its own!

I call it -
(A)lgorithmic (I)ndexing
Spaced out Engineer
1 / 5 (1) Oct 19, 2017
"People tend to assume that machine learning is all about big data and massive amounts of computation but actually what we saw with AlphaGo Zero is that algorithms matter much more," said Silver.

4.9 million training games isn't big data and massive amounts of computation?

The number of possible Go games is at least 10^(10^48). But the amount of information passed by the rules and configuration of the board may constrain global-maxima trajectories to a much smaller space. If only they could get some metric. Integrated information may be valid, if features/heuristics cannot be extracted from the distributed conv neural net. Clearly in the hierarchy, lower-level entities define higher-level properties. But is it all symmetric dependencies and a homeostasis of a perceived function, or does it possess meta-process representation?
Algorithmic Indexing is a good term if there is no modularity, plasticity, or causal emergence.
antialias_physorg
5 / 5 (3) Oct 19, 2017
What I wonder is: with that program as a learning aid, can humans play a much better game of Go than previous generations could?

AIs are probably even better for learning than the standard chess programs. With an AI you play against a learning/adapting opponent. I.e. it will soon resort to focussing on your weaknesses to defeat you - which gives you a chance to concentrate on eliminating those from your play.
(for a similar experiment in the realm of poker google for "Heads up Texas Hold'em AI challenge"...where the AI wiped the floor with top pros)

Now there is a caveat: playing an AI improves your game against a similar AI. Humans may tend to think in different patterns. So training with an AI might not increase your chances against a human (though it likely will).
Eikka
1 / 5 (2) Oct 19, 2017
The number of possible go games is at least 10^(10^48).


Of course most of those games will be trivial or just permutations of the same game.

But the point is that the algorithm only needed to beat the human in the actual game, not prove that it's smarter than a human. In that sense the game is stacked against the humans, because the computer is allowed much more training, and because it has enough time to explore it is very likely to just stumble upon novel strategies by accident and beat the person by a lucky trick.

When you set an evolutionary algorithm loose on a problem with plenty of time to spend, and call it AI, you're arguing that a million monkeys with infinite time at typewriters are smarter than Shakespeare.
Robert_D
5 / 5 (5) Oct 19, 2017
I've followed AI and games like chess and Go for decades. This is NOT a matter of just a lot of computation stumbling on good moves. Chess programs finally beat humans with the alpha-beta algorithm and brute force, but that method couldn't work for Go, the move tree expands too quickly. This new approach is a real breakthrough. Go requires pattern recognition and it seemed machines couldn't match the pattern recognition of humans, but this new approach has breached that barrier.
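
The scale of that difference is easy to illustrate. Chess has roughly 35 legal moves per position and Go roughly 250 (commonly cited averages, not figures from the article), so the game trees diverge fast:

```python
# Why brute-force search scaled to chess but not Go: rough branching
# factors (commonly cited averages) raised to a modest search depth.
chess_branching, go_branching = 35, 250

for depth in (4, 8, 12):
    print(depth,
          f"chess ~10^{len(str(chess_branching ** depth)) - 1}",
          f"go ~10^{len(str(go_branching ** depth)) - 1}")
# At depth 12, chess is ~10^18 positions while Go is ~10^28:
# ten orders of magnitude more for the same lookahead.
```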
Hyperfuzzy
1 / 5 (2) Oct 19, 2017
Because you don't understand it doesn't suggest anything. I hate Go, that's a human response. If I were a digital engine, no hatred, just define the correct response to every move. The tree only has so many correct responses. Just because it's big doesn't deny anything. The universe is big but only has 2 things, one thing if you get it. We created the nonsense of quarks and GR. Now we're doing it with AI. By the way this is not AI, it's tic tac toe. AI will listen to you and offer suggestions, juz say'n, even explain why Go and Chess are not AI. With AI, AI writes its own feedback. So stupidity is very dangerous!
Hyperfuzzy
1 / 5 (1) Oct 19, 2017
It's algorithmic by a human, either worse or better. Stop acting like AI has a mind of its own!

I call it -
(A)lgorithmic (I)ndexing

Call it what you will. Artificial Intelligence will be Intelligence and we will be Artificial! Multiple ways to define it, so ... Too many?
Whydening Gyre
not rated yet Oct 20, 2017
"Self-taught, 'superhuman' AI now even smarter:" (says the) "makers".
Think about THAT for a minute or two...:-)
