Study of protein structures reveals key events in evolutionary history
A new study of proteins, the molecular machines that drive all life, also sheds light on the history of living organisms.
The study, in the journal Structure, reveals that after eons of gradual evolution, proteins suddenly experienced a "big bang" of innovation. The active regions of many proteins, called domains, combined with each other or split apart to produce a host of structures that had never been seen before. This explosion of new forms coincided with the rapidly increasing diversity of the three superkingdoms of life (bacteria; the microbes known as archaea; and eucarya, the group that includes animals, plants, fungi and many other organisms).
Lead author Gustavo Caetano-Anollés, a professor of bioinformatics in the department of crop sciences at the University of Illinois and an affiliate of the Institute for Genomic Biology, has spent years studying protein structures. He calls them "architectures" and suggests that they offer a reliable record of evolutionary events.
All proteins contain domains that can be identified by their structural and functional similarities to one another. These domains are the gears and motors that allow the protein machinery to work. Every protein has one or more of them, and very different proteins can contain the same, or similar, domains.
By conducting a census of all the domains that appear in different groups of organisms and comparing the protein repertoires of hundreds of different groups, the researchers were able to construct a timeline of protein evolution that relates directly to the history of life.
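As a rough illustration of what a domain census involves, the sketch below counts how many groups each domain appears in and compares repertoires across groups. It is not the researchers' method. The group names come from the article, but the domain labels (`d1`, `d2`, ...), the function names and the toy data are all hypothetical, and the step that turns such counts into an evolutionary timeline is not shown.

```python
from collections import Counter

# Hypothetical toy repertoires: group name -> set of domain labels.
# The labels are placeholders, not real domain identifiers.
repertoires = {
    "bacteria": {"d1", "d2", "d3"},
    "archaea": {"d1", "d2", "d4"},
    "eucarya": {"d1", "d3", "d5", "d6"},
}


def domain_census(repertoires):
    """Count how many groups each domain appears in."""
    counts = Counter()
    for domains in repertoires.values():
        counts.update(domains)
    return counts


def shared_domains(repertoires):
    """Return the domains present in every group."""
    return set.intersection(*repertoires.values())


def unique_domains(repertoires):
    """Return, for each group, the domains found in no other group."""
    counts = domain_census(repertoires)
    return {
        group: {d for d in domains if counts[d] == 1}
        for group, domains in repertoires.items()
    }


census = domain_census(repertoires)
shared = shared_domains(repertoires)
unique = unique_domains(repertoires)
```

In this toy data one domain is shared by all three groups, while the eucarya hold the most group-specific domains. Those numbers are built into the example and carry no evidential weight.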
"The history of the protein repertoire should match the history of the entire organism because the organism is made up of all those pieces," Caetano-Anollés said.
He and his co-author, postdoctoral researcher Minglei Wang, were interested in tracing how proteins make use of their domains, or groups of domains, to accomplish various tasks. These domains or domain clusters can be thought of as "modules" that fit together in various ways to achieve different ends.
Unlike the sequence of amino acids in a protein, which is highly susceptible to change, the protein modules found today in living organisms have endured because they perform critical tasks that are beneficial to the organisms that host them, Caetano-Anollés said.
"These modules are resistant to change, they are highly integrated and they are used in different contexts," he said.
By tracing the history of the modules, the researchers were able to build a rough timeline of protein evolution. It revealed that before the three superkingdoms began to emerge, most proteins contained only single domains, each of which performed many tasks.
"As time progressed, these domains started to combine with others and they became very specialized," Caetano-Anollés said. This eventually led to the big bang of protein architectures.
"Exactly at the time of the big bang," he said, many of the combined domains began to split apart, creating numerous single-domain modules again. But these new modules were much more efficient and specialized than their ancient predecessors had been.
"This makes a lot of sense," Caetano-Anollés said. "As you become more complex, you would want to fine-tune things, to do things in a more tailored way."
The protein modules of the three superkingdoms also began to diverge more dramatically from one another, with the eucarya hosting the greatest diversity of modules.
"This explosion of diversity allowed the eucarya to do things with their proteins that other organisms could not do," Caetano-Anollés said.