Scientists deconstruct one of the myths of biological innovation
While the estimated number of protein-coding genes in humans has dwindled to around 20,000 in recent years, scientists have thought that the proteome, the full set of proteins, could be larger than that figure suggests. This diversity of proteins has come to be regarded as one of the main sources of complexity in mammals, including humans.
This theory has been challenged by a study headed by the researcher Michael Tress from Alfonso Valencia's group at the Spanish National Cancer Research Centre (CNIO), published today in the journal Trends in Biochemical Sciences (TIBS). According to the researchers, most genes actually produce a single dominant protein. These results call for a reassessment of the sources of the biological adaptation that led, for example, to the emergence of primates 50 million years ago or to the development of the human brain.
Resizing the human protein map
"The diminishing human genome" is how Valencia described the continuous corrections to the annotations of the human genome more than two years ago. His team set the number of genes at around 19,000. Can something as complex as a human being be built from such a small number of genes?
Skeptical researchers have turned their attention to the proteome as a possible source of biological innovation. Each gene can produce dozens or hundreds of RNAs, which result from combinations of various portions of a gene through alternative splicing. Then the RNAs are translated into proteins. That is why alternative splicing has been identified as an important source of protein diversity.
In this chain from gene to RNA and from RNA to protein, the authors of the paper noticed a vast gap: the number of RNAs, or transcripts, is on the order of hundreds of thousands in humans, whereas the number of proteins quantified experimentally amounts to little more than 12,000. "The problem is that the huge number of transcripts led us to assume there is a larger number of proteins, but their presence within the cells has never been demonstrated," explains Michael Tress, principal investigator on the project.
"One gene, one protein, or one gene, several proteins?" the researchers asked in their study. To answer this question, they conducted a comprehensive meta-analysis compiling data from eight large-scale experiments and from human protein and peptide databases. The data came from a wide range of tissues and cell lines and from different developmental stages.
The results show that while a single gene can give rise to many alternative RNA variants, only a few genes (246, slightly more than 1 percent of human genes) showed clear evidence of producing more than one protein. "Most genes produce a single dominant protein. This tells us that alternative splicing is not essential for the complexity of the proteome," explains Tress. According to the authors, when alternative splicing does produce additional proteins, these are highly conserved, with evolutionary origins that can go back more than 500 million years, and differ only very subtly in structure and function.
Predicting the consequences of different genetic variants
These observations may have significant implications for biomedicine, particularly for predicting the effects of genetic variants or mutations in the body. The team suggests that only those DNA mutations that affect the dominant proteins will be detrimental.
Despite the limited evidence that alternative splicing yields additional proteins in healthy cells, the situation is different in diseases such as cancer, in which the process plays a fundamental role in generating new forms of proteins with aberrant functions that compromise the viability of the organism.
Researchers are now pondering the existence of all those RNAs for which no proteins have been detected and which therefore have no defined biological function at present. Could they be lost information? Useless information? Do they play new regulatory roles still to be discovered? For now, these questions remain unanswered.