(Phys.org) Science journalist Nicola Nosengo has published a News Feature in the latest issue of the journal Nature, outlining the work being done to use computers and databases to take on the tasks associated with discovering new and useful materials. In the same issue, a team working at Haverford College describes a method for machine-learning-assisted materials discovery that draws on failed experiments.

Discovering new materials for use in solving problems, or for creating new types of structures or devices, is notoriously difficult work. Most in the field would describe it as haphazard, with most discoveries coming about at least partly by chance. Most often the process starts with clearly defining a problem, e.g. noting that a certain type of battery should be able to hold a charge longer. Researchers then look at all of the materials discovered so far that fall into a certain category to see if any of them might fit the bill. If that does not work, they strike out into the unknown to see if a material exists naturally in the world that has not yet been identified as a possibility. If that fails too, the next step is to see if a new material can be made by combining other materials under various conditions, a process so fraught with difficulties that most simply do not bother, hoping instead that someone will stumble across a solution by accident sometime in the near future.

But, Nosengo points out, things do not have to go this way. Why not use computers to do the looking for us, he asks, or perhaps even better, get them to discover new materials for us by virtually combining ingredients and virtually subjecting them to different conditions? Scientists are working on this idea, he notes, starting with building databases that hold information about the basic properties of already known materials, all subdivided into classes, such as those that have crystal structures (useful in battery making). Several groups have also been working on developing algorithms to use such data, such as one called simply Intelligent Search. He notes as well that the White House got involved back in 2011 by backing the Materials Genome Initiative, which is modeled on the approach of the Human Genome Project.

As one example of an actual project, the team at Haverford demonstrates in its paper a new approach to developing algorithms that allow computers to use reaction data to predict reaction outcomes, a necessary component of any large system dedicated to creating new materials out of basic components without guidance from humans.

More information: Paul Raccuglia et al. Machine-learning-assisted materials discovery using failed experiments, Nature (2016). DOI: 10.1038/nature17439

Abstract
Inorganic–organic hybrid materials such as organically templated metal oxides, metal–organic frameworks (MOFs) and organohalide perovskites have been studied for decades, and hydrothermal and (non-aqueous) solvothermal syntheses have produced thousands of new materials that collectively contain nearly all the metals in the periodic table. Nevertheless, the formation of these compounds is not fully understood, and development of new compounds relies primarily on exploratory syntheses. Simulation- and data-driven approaches (promoted by efforts such as the Materials Genome Initiative) provide an alternative to experimental trial-and-error. Three major strategies are: simulation-based predictions of physical properties (for example, charge mobility, photovoltaic properties, gas adsorption capacity or lithium-ion intercalation) to identify promising target candidates for synthetic efforts; determination of the structure–property relationship from large bodies of experimental data, enabled by integration with high-throughput synthesis and measurement tools; and clustering on the basis of similar crystallographic structure (for example, zeolite structure classification or gas adsorption properties). Here we demonstrate an alternative approach that uses machine-learning algorithms trained on reaction data to predict reaction outcomes for the crystallization of templated vanadium selenites. We used information on 'dark' reactions—failed or unsuccessful hydrothermal syntheses—collected from archived laboratory notebooks from our laboratory, and added physicochemical property descriptions to the raw notebook information using cheminformatics techniques. We used the resulting data to train a machine-learning model to predict reaction success. 
When carrying out hydrothermal synthesis experiments using previously untested, commercially available organic building blocks, our machine-learning model outperformed traditional human strategies, and successfully predicted conditions for new organically templated inorganic product formation with a success rate of 89 per cent. Inverting the machine-learning model reveals new hypotheses regarding the conditions for successful product formation.
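To make the idea in the abstract concrete, here is a minimal sketch of predicting reaction success from a record of past reactions that includes the failed ones. It is an illustration only, under loud assumptions: this article does not say which model or descriptors the Haverford team used, so a simple k-nearest-neighbor vote stands in for their model, and the three features (temperature, pH, amine amount) and every data value are invented for the example.

```python
from collections import Counter
from math import dist

# Hypothetical reaction records: (temperature in C, pH, amine amount in mmol) -> outcome.
# All numbers are invented for illustration; they are NOT data from the paper.
# Failed ("dark") reactions are kept alongside successes so the model can learn from both.
REACTIONS = [
    ((90.0, 3.0, 1.0), "success"),
    ((110.0, 4.0, 1.5), "success"),
    ((100.0, 3.5, 2.0), "success"),
    ((150.0, 8.0, 0.2), "failure"),
    ((160.0, 7.5, 0.1), "failure"),
    ((140.0, 9.0, 0.3), "failure"),
]


def fit_scaler(rows):
    """Return per-feature (min, max) pairs so features can share a 0..1 scale."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def scale(row, bounds):
    """Min-max scale one feature vector; constant features map to 0.0."""
    return tuple(
        (value - lo) / (hi - lo) if hi > lo else 0.0
        for value, (lo, hi) in zip(row, bounds)
    )


def predict(conditions, reactions=REACTIONS, k=3):
    """Predict an outcome by majority vote among the k nearest recorded reactions.

    k-nearest-neighbor is an assumed stand-in, not the model from the paper.
    """
    bounds = fit_scaler([features for features, _ in reactions])
    query = scale(conditions, bounds)
    ranked = sorted(reactions, key=lambda r: dist(scale(r[0], bounds), query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Calling `predict((95.0, 3.2, 1.2))` asks the sketch whether an untried set of conditions resembles past successes or past failures. Without the failed reactions in `REACTIONS` there would be nothing for the "failure" vote to come from, which is the point the abstract makes about archived notebook data.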

Journal information: Nature