Algorithm for predicting protein pairings could help show how living systems work

An algorithm that models how proteins inside cells interact with each other will enhance the study of biology and shed light on how proteins work together to complete tasks such as turning food into energy.

Researchers have developed an algorithm that aids our understanding of how living systems work, by identifying which proteins within cells will interact with each other, based on their genetic sequences alone.

The ability to generate huge amounts of data from genetic sequencing has developed rapidly in the past decade, but the challenge for researchers is applying that sequence data to better understand living systems. The new research, published in the journal Proceedings of the National Academy of Sciences, is a significant step forward because biological processes, such as how our bodies turn food into energy, are driven by specific protein-protein interactions.

"We were really surprised that our algorithm was powerful enough to make accurate predictions in the absence of experimentally-derived data," said study co-author Dr Lucy Colwell, from the University of Cambridge's Department of Chemistry, who led the study with Ned Wingreen of Princeton University. "Being able to predict these interactions will help us understand how proteins fit and work together to complete required tasks – and using an algorithm is much faster and much cheaper than relying on experiments."

When proteins interact with each other, they stick together to form protein complexes. In her previous research, Colwell found that if the two interacting proteins were known, sequence data could be used to figure out the structure of these complexes. Once the structure of the complexes is known, researchers can then investigate what is happening chemically. However, the question of which proteins interact with each other still required expensive, time-consuming experiments. Each cell often contains multiple versions of the same protein, and it wasn't possible to predict which version of each protein would interact specifically – instead, experiments involve trying all options to see which ones stick.

In the current paper, the researchers used a mathematical algorithm to sift through the possible interaction partners and identify pairs of proteins that interact with each other. The method correctly predicted 93% of protein-protein interactions present in a dataset of more than 40,000 protein sequences for which the pairing is known, without first being provided any examples of correct pairs.

When two proteins stick together, some amino acids on one chain stick to the amino acids on the other chain. The boundaries between interacting proteins tend to evolve together over time, causing their sequences to mirror each other.
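One common way to quantify this kind of mirroring is the mutual information between a position in one protein family and a position in the other, measured across many known pairs. The Python sketch below is an illustration only and is not taken from the study; the function name and the toy columns are assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(column_a, column_b):
    """Mutual information (in bits) between two aligned sequence columns.

    column_a[k] and column_b[k] are the amino acids found at one fixed
    position in protein A and one fixed position in protein B for the
    k-th paired example. Hypothetical helper, for illustration only.
    """
    n = len(column_a)
    count_a = Counter(column_a)
    count_b = Counter(column_b)
    count_ab = Counter(zip(column_a, column_b))
    return sum(
        (c / n) * math.log2(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


# Toy data: whenever protein A has 'A', its partner has 'D', and so on.
covarying = mutual_information("AACC", "DDFF")
# Toy data: the residue in A says nothing about the residue in B.
unrelated = mutual_information("AACC", "DFDF")
```

In this toy case the co-varying columns carry one bit of shared information and the unrelated columns carry none. Real alignments are noisier and need corrections for limited sampling and shared ancestry.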

The algorithm uses this effect to build a model of the interaction. It first randomly pairs protein versions within each organism – because interacting pairs tend to be more similar in sequence to one another than non-interacting pairs, the algorithm can quickly identify a small set of largely correct pairings from the random starting point.

Using this small set, the algorithm measures whether the amino acid at a particular location in the first protein influences which amino acid occurs at a particular location in the second protein. These dependencies, learned from the data, are incorporated into a model and used to calculate the interaction strengths for each possible protein pair. Low-scoring pairings are eliminated, and the remaining set is used to build an updated model.
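The loop described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the researchers' implementation: the scoring model here is a plain log-odds of co-occurrence with a pseudocount, standing in for whatever statistical model the study actually uses, and all function names, the greedy matching step and the schedule for growing the retained set are hypothetical choices. Sequences are assumed to be pre-aligned, of equal length within each family, and free of gap characters.

```python
import math
import random
from collections import Counter

ALPHABET_SIZE = 20  # standard amino acids; gaps are ignored in this sketch


def fit_score(pairs, pseudocount=0.5):
    """Learn residue co-occurrence from (seq_a, seq_b) pairs; return a scorer."""
    n, q = len(pairs), ALPHABET_SIZE
    len_a, len_b = len(pairs[0][0]), len(pairs[0][1])
    single_a = [Counter(a[i] for a, _ in pairs) for i in range(len_a)]
    single_b = [Counter(b[j] for _, b in pairs) for j in range(len_b)]
    joint = {(i, j): Counter((a[i], b[j]) for a, b in pairs)
             for i in range(len_a) for j in range(len_b)}

    def score(a, b):
        # Sum of log-odds: how much more often these residues co-occur
        # than expected if the two positions were independent.
        total = 0.0
        for (i, j), counts in joint.items():
            p_ab = (counts[(a[i], b[j])] + pseudocount / q**2) / (n + pseudocount)
            p_a = (single_a[i][a[i]] + pseudocount / q) / (n + pseudocount)
            p_b = (single_b[j][b[j]] + pseudocount / q) / (n + pseudocount)
            total += math.log(p_ab / (p_a * p_b))
        return total

    return score


def pair_species(score, seqs_a, seqs_b):
    """Greedily match each A sequence to one B sequence within an organism."""
    candidates = sorted(((score(a, b), ia, ib)
                         for ia, a in enumerate(seqs_a)
                         for ib, b in enumerate(seqs_b)), reverse=True)
    used_a, used_b, chosen = set(), set(), []
    for s, ia, ib in candidates:
        if ia not in used_a and ib not in used_b:
            chosen.append((s, seqs_a[ia], seqs_b[ib]))
            used_a.add(ia)
            used_b.add(ib)
    return chosen


def iterative_pairing(species, rounds=5, seed=0):
    """species maps an organism name to (list of A seqs, list of B seqs)."""
    rng = random.Random(seed)
    # Random starting point: shuffle the B sequences within each organism.
    training = []
    for seqs_a, seqs_b in species.values():
        shuffled = list(seqs_b)
        rng.shuffle(shuffled)
        training.extend(zip(seqs_a, shuffled))
    total, ranked = len(training), []
    for r in range(1, rounds + 1):
        score = fit_score(training)
        ranked = sorted((p for seqs_a, seqs_b in species.values()
                         for p in pair_species(score, seqs_a, seqs_b)),
                        reverse=True)
        # Drop low-scoring pairings; keep a growing top slice for the next model.
        keep = max(1, total * r // rounds)
        training = [(a, b) for _, a, b in ranked[:keep]]
    return [(a, b) for _, a, b in ranked]
```

Calling `iterative_pairing({"organism_1": (a_sequences, b_sequences), ...})` returns one predicted partner per protein in each organism. On real data the accuracy depends heavily on the scoring model and on how many organisms are available, so this toy version should not be expected to reproduce the reported results.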

The researchers thought that the algorithm would only work accurately if it first 'learned' what makes a good protein-protein pair by studying pairs that have been discovered in experiments. This meant that the researchers had to give the algorithm some known protein pairs, or 'gold standards,' against which to compare new sequences. The team used two well-studied families of proteins, histidine kinases and response regulators, which interact as part of a signaling system in bacteria.

But known examples are often scarce, and there are tens of millions of undiscovered protein-protein interactions in cells. So the team decided to see if they could reduce the amount of training they gave the algorithm. They gradually lowered the number of known histidine kinase-response regulator pairs that they fed into the algorithm, and were surprised to find that the algorithm continued to work. Finally, they ran the algorithm without giving it any such training pairs, and it still predicted new pairs with 93% accuracy.

"The fact that we didn't need a set of training data was really surprising," said Colwell.

The algorithm was developed using proteins from bacteria, and the researchers are now extending the technique to other organisms. "Reactions in living organisms are driven by specific protein interactions," said Colwell. "This approach allows us to identify and probe these interactions, an essential step towards building a picture of how living systems work."

More information: Anne-Florence Bitbol et al. 'Inferring interaction partners from protein sequences.' Proceedings of the National Academy of Sciences (2016). DOI: 10.1073/pnas.1606762113

Journal information: Proceedings of the National Academy of Sciences

Provided by University of Cambridge
