An international team of researchers, including several from MIT, has developed a computational model that helps identify relationships between proteins and the enzymes that regulate them.
The work could help researchers understand the complex protein networks that influence human disease, including cancer. The researchers report their findings in the cover story of the June 29 issue of Cell.
The new method, known as NetworKIN, can trawl through existing research data and use it to illuminate protein networks that control cellular processes. It focuses on enzymes called kinases, which are involved in many cell signaling pathways, including repair of DNA damage that can lead to cancer.
The model was developed by researchers from MIT, the Samuel Lunenfeld Research Institute of Mount Sinai Hospital in Canada and the European Molecular Biology Laboratory in Germany.
NetworKIN "gives us the tools to take the information we already have and begin to build a map of the kinase signaling pathways within the cells," said Michael Yaffe, MIT associate professor of biology and biological engineering, a member of MIT's Center for Cancer Research and one of the authors of the paper.
"By getting a network-wide view, multiple aberrant genes of kinase-controlled processes are more easily targeted," said Rune Linding, a visiting scientist at MIT's Center for Cancer Research, postdoctoral fellow at the Samuel Lunenfeld Research Institute and one of the lead authors of the paper. "In the future, complex human diseases will be treated by targeting multiple genes."
Kinases act by phosphorylating a protein, that is, by adding a phosphate group to it. That signal tells the protein what it should be doing. Yaffe estimated that at any one time, 30 to 50 percent of the proteins in a cell are phosphorylated.
Because kinases play such a critical role in cellular processes, including DNA repair and cell division, scientists have been working to identify where phosphorylation takes place in a target protein. Mass spectrometry makes it easy to identify those sites, but until now there has been no good way to figure out which kinases are acting on each site, Yaffe said.
"It's a huge bottleneck," he said. "We're getting thousands of phosphorylation sites, but we don't know which kinase phosphorylated them, so we don't know what pathway to put them in."
To solve that problem, the researchers developed a two-step approach.
In the first step, they used a pair of previously developed computer programs that can analyze the amino acid sequence of the phosphorylation site and predict which family of kinases is most likely to bind to and phosphorylate it.
However, each family includes several kinases, and the sequence alone cannot tell you which one acts on the site.
In the second step, to pinpoint the individual kinases, the researchers developed a computational model that analyzes databases containing information about signaling pathways and protein interactions. The program also performs "text mining" of published articles and abstracts to search for reported protein-kinase interactions.
By combining these two sources of information--sequences of the target proteins and contextual information about the interaction between proteins and kinases--the computational model can develop a detailed network that would be very difficult to create by manually examining the available data.
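The idea of combining the two sources can be illustrated with a minimal Python sketch. It is not NetworKIN's actual algorithm: the family table, the score ranges and the choice to multiply the two scores are all assumptions made here for illustration, and the names `KINASE_FAMILIES` and `rank_kinases` are hypothetical.

```python
# Hypothetical family membership table (illustrative, not from the paper).
KINASE_FAMILIES = {
    "CDK": ["CDK1", "CDK2"],
    "ATM_ATR": ["ATM", "ATR"],
}


def rank_kinases(family_scores, context_scores):
    """Rank individual kinases for one phosphorylation site.

    family_scores:  family name -> sequence-based score in [0, 1]
                    (step one: which family fits the site's sequence).
    context_scores: kinase name -> contextual score in [0, 1]
                    (step two: pathway, interaction and text-mining evidence
                    linking that kinase to the target protein).
    Multiplying the two scores is an assumption made for this sketch.
    """
    ranked = []
    for family, seq_score in family_scores.items():
        for kinase in KINASE_FAMILIES.get(family, []):
            # Kinases with no contextual evidence get a score of zero.
            ctx_score = context_scores.get(kinase, 0.0)
            ranked.append((kinase, seq_score * ctx_score))
    # Highest combined score first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


if __name__ == "__main__":
    family_scores = {"CDK": 0.9, "ATM_ATR": 0.2}
    context_scores = {"CDK1": 0.1, "CDK2": 0.8, "ATM": 0.5}
    for kinase, score in rank_kinases(family_scores, context_scores):
        print(kinase, round(score, 2))
```

In this toy example the sequence score alone cannot separate CDK1 from CDK2, since both belong to the same family; the contextual score is what places CDK2 first.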
"The sequence gets us into the ballpark, but it's all of this contextual information that helps us figure out specifically which kinases are acting on which sites," said Yaffe, who is also affiliated with the Broad Institute of MIT and Harvard, and Beth Israel Deaconess Medical Center.
Other MIT authors on the paper are Gerald Ostheimer, a postdoctoral fellow in biological engineering, Marcel van Vugt, a postdoctoral fellow at the Center for Cancer Research, and Leona Samson, director of the Center for Environmental Health Sciences and professor of biology and biological engineering.