Popularly dubbed "the book of life," the human genome is extraordinarily difficult to read. Without full knowledge of its grammar and syntax, the genome's 2.9 billion base pairs of adenine and thymine, cytosine and guanine provide only limited insight into humanity's underlying genetics.
In a paper published in the July 1, 2012, issue of the journal Nature, researchers at the Ludwig Institute for Cancer Research and the University of California, San Diego School of Medicine open the book further, mapping for the first time a significant portion of the functional sequences in the genome of the mouse, the most widely used mammalian model organism in biomedical research.
"We've known the precise alphabet of the human genome for more than a decade, but not necessarily how those letters make meaningful words, paragraphs or life," said Bing Ren, PhD, head of the Laboratory of Gene Regulation at the Ludwig Institute for Cancer Research at UC San Diego. "We know, for example, that only one to two percent of the functional genome codes for proteins, but that there are highly conserved regions in the genome outside of protein-coding that affect genes and disease development. It's clear these regions do something or they would have changed or disappeared."
Chief among those regions are cis-regulatory elements, key stretches of DNA that appear to regulate the transcription of genes. Misregulation of genes can result in diseases like cancer. Using high-throughput sequencing technologies, Ren and colleagues mapped nearly 300,000 mouse cis-regulatory elements in 19 different tissue and cell types. The unprecedented work provided a functional annotation of nearly 11 percent of the mouse genome, and of more than 70 percent of the conserved, non-coding sequences shared with other mammalian species, including humans.
As expected, the researchers identified different sequences that promote or start gene activity, enhance that activity, and define where it occurs in the body during development. More surprising, said Ren, was that the cis-regulatory elements are grouped into discrete clusters corresponding to spatial domains. "It's a case of form following function," he said. "It makes sense."
While the research is fundamentally revealing, Ren noted it is also just a beginning, a partial picture of the functional genome. Additional studies will be needed in other types of cells and at different stages of development.
"We've mapped and understand 11 percent of the genome," said Ren. "There's still a long way to march."