Each cell contains thousands of proteins, each bearing a unique signature. All proteins, though distinct in shape and function, are built as strings of the same amino acids. Many proteins are vital, as evidenced by the plethora of diseases linked to their absence or malfunction. But how exactly did proteins first come to be? Do they all share a single common ancestor? Or did proteins evolve from many different origins?
Forming a global picture of the protein universe is crucial to addressing these and other important questions, but it's nearly impossible to do. Such a bird's-eye view demands comparisons of nearly innumerable pairs of known and unknown proteins. Now, new research published in the journal PNAS by Prof. Nir Ben-Tal of the Department of Biochemistry and Molecular Biology at Tel Aviv University's Faculty of Life Sciences, Prof. Rachel Kolodny of University of Haifa's Department of Computer Science, and Dr. Sergey Nepomnyachiy of New York University's Polytechnic Institute, is providing a first step toward piecing together a global picture of the protein universe.
"This is the first study that combines sequence and shape similarity between proteins within the context of networks to provide a bird's eye view of the protein universe," said Prof. Ben-Tal. "The network offers a natural way to organize and search among all proteins. It could be used to theorize about protein evolution, suggest evolutionary pathways, and even suggest strategies for the design of new proteins."
A master of their domain
Conveniently, proteins are composed of various combinations of domains: conserved, commonly occurring parts that can function on their own. It is therefore sufficient to analyze the relationships among these domains. The researchers studied the evolutionary relationships among a representative set of 9,710 domains, comparing them in pairs and searching for common motifs. A shared motif includes parts of each of the two compared domains, and can therefore indicate an evolutionary relationship between them. The researchers presented their results as a series of networks, in which edges connect domains with a shared motif.
Their analysis of protein pairs revealed a truly complex picture of protein space: a large connected component surrounded by many isolated "islands."
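The network idea can be illustrated with a minimal Python sketch. It is not the researchers' method: the study detected motifs by comparing the sequence and shape of real domains, whereas here each domain is simply given a made-up set of motif labels. The domain names (`d1` to `d6`), the motif labels (`m1` to `m4`) and the function names are all hypothetical. The sketch links any two domains that share a label, then groups the domains into connected components, so that a larger component and smaller "islands" become visible.

```python
from collections import deque
from itertools import combinations


def build_network(domain_motifs):
    """Link every pair of domains that share at least one motif label."""
    adjacency = {domain: set() for domain in domain_motifs}
    for a, b in combinations(domain_motifs, 2):
        if domain_motifs[a] & domain_motifs[b]:  # non-empty intersection
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Return the components of the network, largest first."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy data (invented for illustration, not taken from the study).
toy_domains = {
    "d1": {"m1"},
    "d2": {"m1", "m2"},
    "d3": {"m2"},
    "d4": {"m3"},
    "d5": {"m3"},
    "d6": {"m4"},
}

network = build_network(toy_domains)
components = connected_components(network)
print([len(c) for c in components])  # prints [3, 2, 1]
```

In this toy example, `d1` and `d3` share no motif directly but end up in the same component because both are linked to `d2`, while `d6` forms an island of its own. Comparing every pair, as done here, is only practical for small sets; a study of thousands of domains would need far more efficient comparison methods.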
"The protein network can be interpreted as a collection of evolutionary paths in protein space," said Prof. Ben-Tal. "Paths in the major connected component of the network include many domains, and demonstrate the sequence and shape resemblance between them. The large number of paths within the major connected component suggest it is particularly easy to add and delete motifs in the continuous region of protein space without impeding stability. Apparently, evolution took advantage of this property to design new proteins with novel functions."
The researchers are currently working on ways of supplementing the study with data on protein function (such as DNA/RNA binding), the role of proteins in disease pathology, and drug binding to individual proteins.