Scientists build highly accurate molecular water model using machine learning
While water is perceived to be one of the simplest substances in the world, modeling its behavior on the atomic or molecular level has frustrated scientists for decades. To date, no single model has been able to accurately represent the full range of water's singular characteristics, including the fact that it is densest at a temperature slightly higher than its melting point.
A new study from the U.S. Department of Energy's (DOE) Argonne National Laboratory has achieved a breakthrough in the effort to mathematically represent how water behaves. To do so, Argonne researchers used machine learning to develop a new, computationally inexpensive water model that more accurately represents the thermodynamic properties of water, including how water changes to ice at the molecular scale.
In the study, researchers at Argonne's Center for Nanoscale Materials (CNM) used a machine learning workflow to optimize a new molecular model of water. They trained their model against extensive experimental data to generate a highly accurate molecular-scale model of water's properties. The CNM is a DOE Office of Science User Facility.
Optimizing model parameters for water has long been a challenge, and more than 50 different water models currently exist, according to Argonne nanoscientist Subramanian Sankaranarayanan, the study's corresponding author.
"We are trying to understand how to navigate the complex parameter space for any given model in order to capture a wide spectrum of water's properties, which is extremely difficult," Sankaranarayanan explained. "There is no existing model that can account for water's melting point, its density maximum and the density of ice, all at the same time."
Quantum mechanical and atomistic models of water have long flummoxed researchers: they are computationally intensive, yet they still fail to reproduce many of water's temperature-dependent properties. According to Henry Chan, Argonne postdoctoral researcher and the lead author of the study, reproducing those properties is even more difficult for simple models, such as the one used in this study.
By treating each entire water molecule, rather than each atom, as the fundamental unit of the model, the researchers were able to run their simulations at low computational cost.
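To give a rough sense of why one unit per molecule is cheaper, the Python sketch below collapses each three-atom water molecule into a single bead and compares how many pairs of sites a naive all-pairs calculation would have to visit. It is an illustration only, not the study's code: placing the bead at the molecule's center of mass, the helper names, and the all-pairs count (real simulations use distance cutoffs) are all assumptions made here.

```python
# Approximate atomic masses in atomic mass units, in the order (O, H, H).
WATER_MASSES = (15.999, 1.008, 1.008)


def center_of_mass(atoms, masses=WATER_MASSES):
    """Collapse one molecule's atoms [(x, y, z), ...] into one bead position.

    Assumption: the bead sits at the center of mass. The article does not
    say how the study positions its coarse-grained units.
    """
    total = sum(masses)
    return tuple(
        sum(m * atom[axis] for m, atom in zip(masses, atoms)) / total
        for axis in range(3)
    )


def coarse_grain(molecules):
    """Map a list of molecules (each a list of 3 atoms) to a list of beads."""
    return [center_of_mass(atoms) for atoms in molecules]


def pair_count(n_sites):
    """Number of distinct site pairs in a naive all-pairs calculation."""
    return n_sites * (n_sites - 1) // 2


if __name__ == "__main__":
    n_molecules = 1000
    atomistic_pairs = pair_count(3 * n_molecules)  # three sites per molecule
    coarse_pairs = pair_count(n_molecules)         # one bead per molecule
    print(atomistic_pairs, coarse_pairs)
```

For 1,000 molecules, the three-site description has roughly nine times as many pairs to consider as the one-bead description, before counting any other savings.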
"While traditionally these simple models introduce a number of approximations and often suffer from poor accuracy, machine learning allows us to create a much more accurate model while maintaining simplicity," said University of Louisville assistant professor Badri Narayanan, a co-first author of the study.
However, even with this reduced computational expense, some physical properties can be difficult to simulate without large-scale supercomputers. The team used the Mira supercomputer at the Argonne Leadership Computing Facility, a DOE Office of Science User Facility, to perform simulations of up to 8 million water molecules to study the growth and formation of interfaces in polycrystalline ice.
According to co-first author and CNM assistant scientist Mathew Cherukara, this new model, termed "coarse-grained," achieves a fidelity on par with models that incorporate an atomic-level description. "Traditionally, you would think that introducing these approximations would typically result in a far worse model—one that's efficient but that does not perform very well," he said. "The beauty is that this molecular model has no right to be as accurate as the atomistic models, but still ends up being so."
To achieve the high accuracy of the coarse-grained model, the researchers trained the model using information drawn from nearly a billion atomic-scale configurations involving temperature-dependent properties that are well known. "Essentially, we said to our model, 'look, this is what the properties are,' and asked it to give us parameters that were able to reproduce them," Chan said.
Training the model involved what Chan called a "hierarchical approach," in which each candidate model was put through a series of tests or evaluations, starting with basic essential properties before working its way up to more complex ones. "You can think of it like trying to teach a child a skill," Chan said. "You start with something fundamental and work your way up once you see progress."
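The staged screening Chan describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' workflow: the parameter names "a" and "b", the two toy property functions, and the tolerances are hypothetical stand-ins, whereas a real evaluation would run molecular simulations. The target values are the familiar density of ice (about 0.917 grams per cubic centimeter) and the melting point of water (273.15 kelvins).

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
# One stage: (property name, predictor, target value, allowed deviation).
Stage = Tuple[str, Callable[[Params], float], float, float]


def toy_ice_density(p: Params) -> float:
    """Hypothetical stand-in for a simulation that returns ice density (g/cm^3)."""
    return 0.9 + 0.1 * p["a"]


def toy_melting_point(p: Params) -> float:
    """Hypothetical stand-in for a simulation that returns melting point (K)."""
    return 270.0 + 10.0 * p["b"]


# Stages are ordered from basic properties to more demanding ones.
STAGES: List[Stage] = [
    ("ice density", toy_ice_density, 0.917, 0.01),
    ("melting point", toy_melting_point, 273.15, 1.0),
]


def hierarchical_screen(candidates: List[Params], stages: List[Stage]) -> List[Params]:
    """Keep only candidates that pass every stage, dropping failures early."""
    survivors = list(candidates)
    for _name, predict, target, tolerance in stages:
        # Later stages are evaluated only on candidates that passed earlier ones.
        survivors = [p for p in survivors if abs(predict(p) - target) <= tolerance]
        if not survivors:
            break
    return survivors


candidates = [
    {"a": 0.17, "b": 0.3},  # passes both toy stages
    {"a": 0.50, "b": 0.3},  # fails the first stage
    {"a": 0.17, "b": 0.9},  # passes the first stage, fails the second
]

if __name__ == "__main__":
    print(hierarchical_screen(candidates, STAGES))
```

Because a candidate that fails a basic test never reaches the later ones, the more complex evaluations are spent only on candidates that have already shown progress.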
The researchers also showed that their approach could be used to improve the performance of other existing atomistic and molecular models. "We were able to significantly improve the performance of existing high-quality water models using our hierarchical approach. In principle, we should be able to revisit all molecular models and help each one of them attain their best performance," Sankaranarayanan said.
A paper based on the study, "Machine learning coarse grained models for water," appeared in the January 22 online issue of Nature Communications. Other Argonne authors included Chris Benmore, Stephen Gray, and Troy Loeffler.