Since the completion of the human genome sequence, a question has baffled researchers studying gene control: How is it that humans, far more complex than the lowly yeast, do not carry proportionally more gene-control proteins in their genome?
Now, a collaborative effort at the Johns Hopkins School of Medicine to examine protein-DNA interactions across the whole genome has uncovered more than 300 proteins that appear to control genes, a newly discovered function for proteins previously known only for other roles in cells. The results, which appear in the October 30 issue of Cell, provide a partial explanation for human complexity over yeast but also throw a curveball at what we previously understood about protein functions.
"Everyone knows that transcription factors bind to DNA and everyone knows that they bind in a sequence-specific manner," says Heng Zhu, Ph.D., an assistant professor in pharmacology and molecular sciences and a member of the High Throughput Biology Center. "But you only find what you look for, so we looked beyond and discovered proteins that essentially moonlight as transcription factors."
The team suspects that many more proteins encoded by the human genome might also be moonlighting to control genes, which brings researchers to the paradox that less complex organisms, such as plants, appear to have more transcription factors than humans. "Maybe most of our genes are doing double, triple or quadruple the work," says Zhu. "This may be a widespread phenomenon in humans and the key to how we can be so complex without significantly more genes than organisms like plants."
The team set out to figure out which proteins encoded by the genome bind to which DNA sequences. Examination of the human genome sequence had predicted that about 1,400 to 1,700 of the encoded proteins are so-called transcription factors, proteins that bind to specific sequences in DNA to turn a gene on or off. In addition to these proteins, the researchers included other types that are known to maintain chromosome structure and bind to structurally different RNA. Also included were proteins that normally relay information within a cell and are not thought to come in direct contact with DNA. In total, they collected nearly 4,200 human proteins on a protein microarray, or protein "chip."
To identify proteins on that chip that bound DNA directly, the group first reviewed previously published scientific literature and catalogued 460 different, short sequences of DNA that are known or predicted to bind proteins.
One at a time, the team tested each of the 460 DNA sequences against the chip carrying nearly 4,200 proteins. In addition to finding many protein-DNA interactions for transcription factors, some confirming previously known interactions, the team found 367 new unconventional DNA-binding proteins, that is, proteins known to do other cellular jobs.
"This nearly doubled the number of known protein-DNA interactions," says Jiang Qian, Ph.D., an assistant professor of ophthalmology at Hopkins. "But we only looked at about a fifth of all the proteins in the human genome — there could be hundreds, even thousands more of these unconventional transcription factors that we don't yet know about."
One of the unconventional transcription factors discovered was the protein MAP Kinase 1, also known as ERK2, a protein long studied for its ability to control cell growth and development via its ability to add phosphate groups to other molecules.
"It's one of the best studied proteins out there, but no one ever thought ERK2 could directly regulate gene expression by actually binding to DNA," says Seth Blackshaw, Ph.D., an assistant professor of neuroscience and a member of the High Throughput Biology Center and the Neuroregeneration Program at the Institute for Cell Engineering.
To be certain that ERK2 really does bind DNA and control genes in living cells, the team tested the protein in human cells. They found that ERK2 mutated to no longer bind DNA causes specific genes to be turned on, while both normal ERK2 and ERK2 that's no longer able to chemically modify proteins turn off those same genes. "It clearly acts to repress specific genes," says Blackshaw. "Maybe this will help clear up some of the puzzles that have arisen in ERK2 experiments over the years."
A central question in understanding how genes are controlled is which of the 20,000 proteins encoded by our genome act on which segments of DNA. "It's not possible to predict this a priori," Blackshaw says. "Someone has to do the experiment, because we just don't know enough about how proteins bind to DNA. Patterns have surfaced in this field's 45-year history, but not enough yet to establish any rules."
Source: Johns Hopkins Medical Institutions