It took nearly a century of trial and error for human scientists to organize the periodic table of elements, arguably one of the greatest scientific achievements in chemistry, into its current form.
A new artificial intelligence (AI) program developed by Stanford physicists accomplished the same feat in just a few hours.
Called Atom2Vec, the program successfully learned to distinguish between different atoms after analyzing a list of chemical compound names from an online database. The unsupervised AI then used concepts borrowed from the field of natural language processing – in particular, the idea that the properties of words can be understood by looking at other words surrounding them – to cluster the elements according to their chemical properties.
"We wanted to know whether an AI can be smart enough to discover the periodic table on its own, and our team showed that it can," said study leader Shou-Cheng Zhang, the J. G. Jackson and C. J. Wood Professor of Physics at Stanford's School of Humanities and Sciences.
Zhang says the research, published in the June 25 issue of Proceedings of the National Academy of Sciences, is an important first step toward a more ambitious goal of his: designing a replacement for the Turing test – the current gold standard for gauging machine intelligence.
To pass the Turing test, an AI must be capable of responding to written questions in ways that are indistinguishable from a human's responses. But Zhang thinks the test is flawed because it is subjective. "Humans are the product of evolution and our minds are cluttered with all sorts of irrationalities. For an AI to pass the Turing test, it would need to reproduce all of our human irrationalities," Zhang said. "That's very difficult to do, and not a particularly good use of programmers' time."
Zhang would instead like to propose a new benchmark of machine intelligence. "We want to see if we can design an AI that can beat humans in discovering a new law of nature," he said. "But in order to do that, we first have to test whether our AI can make some of the greatest discoveries already made by humans."
By recreating the periodic table of elements, Atom2Vec has achieved this secondary goal, Zhang says.
Potassium is to king as …
Zhang and his group modeled Atom2Vec on an AI program that Google engineers created to parse natural language. Called Word2Vec, the language AI works by converting words into numerical codes, or vectors. By analyzing the vectors, the AI can estimate the probability of a word appearing in a text given the co-occurrence of other words.
For example, the word "king" is often accompanied by "queen," and "man" by "woman." Thus, the mathematical vector of "king" might be translated roughly as "king = a queen minus a woman plus a man."
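The word arithmetic above can be illustrated with a toy sketch. The two-dimensional vectors below are hand-made, hypothetical values chosen so the analogy works out; they are not taken from a trained Word2Vec model, and the helper name `analogy` is made up for this example.

```python
import math

# Hypothetical 2-D toy vectors (NOT from a trained model).
# Axis 0 loosely encodes "royalty", axis 1 loosely encodes "gender".
vectors = {
    "king":  [0.9, 0.9],
    "queen": [0.9, -0.9],
    "man":   [0.1, 0.9],
    "woman": [0.1, -0.9],
    "apple": [-0.5, 0.0],  # unrelated distractor word
}


def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three query words so the answer is a different word.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return min(candidates, key=lambda w: math.dist(vectors[w], target))


# "queen minus woman plus man" should land nearest to "king".
print(analogy("queen", "woman", "man"))
```

In a real model the vectors have hundreds of dimensions and are learned from text, so the match is approximate rather than exact.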
"We can apply the same idea to atoms," Zhang said. "Instead of feeding in all of the words and sentences from a collection of texts, we fed Atom2Vec all the known chemical compounds, such as NaCl, KCl, H2O, and so on."
From this sparse data, the AI program figured out, for example, that potassium (K) and sodium (Na) must have similar properties because both elements can bind with chlorine (Cl). "Just like king and queen are similar, potassium and sodium are similar," Zhang said.
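The intuition that atoms appearing in the same chemical "contexts" are similar can be sketched in a few lines. This is a simplified illustration, not the published Atom2Vec method: it counts, for each atom, the environments (the rest of the formula) it appears in, then compares atoms by cosine similarity. NaCl, KCl and H2O come from the article; the other compounds, the dictionary encoding, and the function names are assumptions added for this example.

```python
import math
from collections import Counter, defaultdict

# Each compound is written as {element: count}. The first three appear in
# the article; the rest are extra well-known compounds added for illustration.
compounds = [
    {"Na": 1, "Cl": 1},  # NaCl
    {"K": 1, "Cl": 1},   # KCl
    {"H": 2, "O": 1},    # H2O
    {"Na": 1, "Br": 1},  # NaBr
    {"K": 1, "Br": 1},   # KBr
    {"H": 2, "S": 1},    # H2S
    {"Mg": 1, "O": 1},   # MgO
    {"Ca": 1, "O": 1},   # CaO
]


def environment(compound, atom):
    """Describe what surrounds `atom`: its own count plus the other elements."""
    rest = sorted((el, n) for el, n in compound.items() if el != atom)
    return (compound[atom], tuple(rest))


# atom -> Counter of the environments that atom was seen in
atom_env = defaultdict(Counter)
for compound in compounds:
    for atom in compound:
        atom_env[atom][environment(compound, atom)] += 1


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def most_similar(atom):
    """Return the other atom whose environments best match those of `atom`."""
    others = [a for a in atom_env if a != atom]
    return max(others, key=lambda a: cosine(atom_env[atom], atom_env[a]))


print(most_similar("Na"))  # sodium shares all of its environments with potassium
```

With this tiny data set, sodium and potassium end up with identical environment counts because both pair with chlorine and bromine in the same ratios, which mirrors the reasoning in the paragraph above.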
Zhang hopes that in the future, scientists can harness Atom2Vec's knowledge to discover and design new materials. "For this project, the AI program was unsupervised, but you could imagine giving it a goal and directing it to find, for example, a material that is highly efficient at converting sunlight to energy," Zhang said.
His team is already at work on version 2.0 of their AI program, which will focus on cracking an intractable problem in medical research: designing just the right antibody to attack antigens – molecules capable of inducing an immune response – that are specific to cancer cells. Currently, one of the most promising approaches to curing cancer is cancer immunotherapy, which involves harnessing the antibodies that can attack antigens on cancer cells.
But the human body can produce more than 10 million unique antibodies, each of which is made up of a different combination of about 50 genes. "If we can map these building block genes onto a mathematical vector, then we can organize all antibodies into something similar to a periodic table," Zhang says. "Then, if you discover that one antibody is effective against an antigen but is toxic, you can look within the same family for another antibody that is just as effective but less toxic."
Quan Zhou et al., "Atom2Vec: Learning atoms for materials discovery," PNAS (2018). www.pnas.org/cgi/doi/10.1073/pnas.1801181115