A method for computer-aided modeling and simulation of large proteins and other biomolecules
Two computational scientists at Freie Universität Berlin are changing the way large proteins are modeled inside computers by combining machine learning, an area of artificial intelligence, with statistical physics. Their findings were published in Proceedings of the National Academy of Sciences.
"Although biological molecules such as proteins are too small to see with the naked eye, they consist of a large number of atoms," says Dr. Simon Olsson, Alexander von Humboldt fellow and lead author on the study. "This makes it technically challenging to study them to the extent necessary to understand how they work." Gaining insights into how proteins work is critical for several biomedical and biotechnological applications, including improving global food security, crop protection and fighting the rise of multi-resistant pathogens.
In their article, the authors describe a procedure to overcome the technical challenges of simulating large proteins. The key insight is that proteins behave like social networks. Dr. Frank Noé, a professor at Freie Universität Berlin, says, "Proteins are known to be composed of multiple smaller building blocks: the right composition of these leads to the emergence of biological functions as we know it."
Traditionally, proteins are treated as a whole when simulated inside a computer, as this is how they are observed in experiments. However, their building blocks are small molecular switches, each of which can spontaneously change between multiple states. Understanding this switching behavior is key to understanding how function emerges, and it therefore also matters for applications.
"The issue is really that we will never be able to simulate all the possible configurations of these switches," Dr. Simon Olsson says. "There are just too many of them, they grow exponentially fast. Say one switch has two states, two switches can be in four settings, three switches in eight. Once you have 200 switches, the number of settings equals to the number of atoms in the known universe."
Reformulating the simulations to use the local building blocks and to learn how they are coupled breaks this unfavorable scaling and makes large protein simulations possible. This learning is done with methods of modern artificial intelligence (AI). Simon Olsson explains, "Although it appears more complicated to model many building blocks rather than just a single configurational state, it turns out that we can use ideas from AI to make computers learn a 'social network' of the building blocks and use this to understand their behavior."
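To see why a network of coupled building blocks scales better, it helps to count what has to be learned. The article does not spell out the form of the model, so the sketch below makes an explicit assumption: each switch takes the value -1 or +1, and the chance that it is "on" at the next step depends on the current states of all switches through one coupling strength per pair plus a bias, in the style of an Ising-type model. All names and numbers here are hypothetical and chosen only for illustration.

```python
import math


def full_state_count(n_switches: int) -> int:
    """Joint configurations a whole-protein description would need to cover."""
    return 2 ** n_switches


def pairwise_parameter_count(n_switches: int) -> int:
    """Parameters of the assumed network model: one bias per switch
    plus one coupling for every ordered pair (i, j)."""
    return n_switches + n_switches ** 2


def on_probability(i, states, couplings, biases):
    """Probability that switch i is 'on' (+1) at the next step, given the
    current states (each -1 or +1). ASSUMPTION: logistic, Ising-like form."""
    field = biases[i] + sum(c * s for c, s in zip(couplings[i], states))
    return 1.0 / (1.0 + math.exp(-2.0 * field))


# Scaling comparison for 200 two-state switches.
print("joint configurations:", f"{full_state_count(200):.3e}")
print("network parameters:  ", pairwise_parameter_count(200))

# Toy network of three switches (illustrative values only):
# switch 0 is strongly coupled to switch 1; switch 2 is uncoupled.
states = [-1, 1, -1]
couplings = [
    [0.0, 1.5, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
biases = [0.0, 0.0, 0.0]

p_switch0 = on_probability(0, states, couplings, biases)
p_switch2 = on_probability(2, states, couplings, biases)
print("P(switch 0 on next):", round(p_switch0, 3))
print("P(switch 2 on next):", round(p_switch2, 3))
```

In this toy setup, switch 0 tends to follow its "friend" switch 1, while the uncoupled switch 2 stays at even odds. The number of parameters grows with the square of the number of switches rather than exponentially, which is the kind of saving the reformulation relies on, under the assumptions stated above.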
Knowing this social network of the protein building blocks turns out to have several advantages. Dr. Frank Noé explains, "Determining this network does not require us to see all the possible configurations of the molecular system, yet once we have the network we can characterize them!" The protein social network distills the essentials of how proteins work, and thereby makes significant strides toward reducing the computational cost of determining protein function.