New functions for 'junk' DNA?

Mar 31, 2014
This image shows the evolutionary relationships among the species analyzed for conserved non-coding sequences. 'Myr' stands for million years ago. Ellipses are approximate times of whole-genome duplications. Credit: Diane Burgess

DNA is the molecule that encodes the genetic instructions enabling a cell to produce the thousands of proteins it typically needs. The linear sequence of the A, T, C, and G bases in what is called coding DNA determines the particular protein that a short segment of DNA, known as a gene, will encode. But in many organisms, there is much more DNA in a cell than is needed to code for all the necessary proteins. This non-coding DNA was often referred to as "junk" DNA because it seemed unnecessary. In retrospect, the functions of these seemingly unnecessary DNA sequences were simply not yet understood.

We now know that non-coding DNA can have important functions other than encoding proteins. Many non-coding sequences produce RNA molecules that regulate gene expression by turning genes on and off. Others contain enhancer or inhibitory elements. Recent work by the international ENCODE (Encyclopedia of DNA Elements) Project (1, 2) suggested that a large percentage of non-coding DNA, which makes up an estimated 95% of the human genome, has a function in gene regulation. Thus, it is premature to say that "junk" DNA does not have a function; we just need to find out what it is!

To help understand the importance of this large amount of non-coding DNA in plants, Diane Burgess and Michael Freeling at the University of California, Berkeley have identified numerous conserved non-coding sequences (CNSs) of DNA that are found in a wide variety of plant species, including rice, banana, and cacao. DNA sequences that are highly conserved, meaning that they are identical or nearly so in a variety of organisms, are likely to have important functions in basic biological processes. For example, the gene encoding ribosomal RNA, an essential part of the protein-synthesizing machinery needed by cells of all organisms, is highly conserved. Changes in the sequence of this key molecule are poorly tolerated, so ribosomal RNA sequences have changed relatively little over millions of years of evolution.

To identify the most highly conserved plant CNSs, Burgess and Freeling compared the genome (one copy of all the DNA in an organism) of the model plant Arabidopsis, a member of the mustard family, with the genome of columbine, a distantly related plant of the buttercup family. The phylogenetic tree (see figure) shows the evolutionary relationships among the dicot (yellow) and monocot (blue) species they studied. Branch points represent points of divergence of two species from a common ancestor. Sequences in common between these two plants, which diverged over 130 million years ago, are likely to have important functions or they would have been lost due to random mutations or insertions or deletions.

They found over 200 CNSs in common between these distantly related species. In addition, 59 of these CNSs were also found in monocots, which are even more distant evolutionarily, and these were termed deep CNSs. Finally, they showed that 51 of these appear to be found in all flowering plants, based on their occurrence in Amborella, a flowering plant that diverged from all of the above plants even before the monocot-dicot split (see figure).
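The successive filtering described above (sequences shared by two distant dicots, then also by monocots, then also by Amborella) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the sequences are invented, the helper names `kmers` and `shared_kmers` are hypothetical, and exact matching of short substrings stands in for the genome alignment a real study would use. It is not the method used by Burgess and Freeling.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(seqs, k):
    """Return the k-mers that occur in every sequence in seqs."""
    return set.intersection(*(kmers(s, k) for s in seqs))


# Invented toy sequences, NOT real genomic data.
toy = {
    "arabidopsis": "TTGACGTGGCAAACCT",
    "columbine":   "GGACGTGGCATTCCAA",
    "rice":        "CACGTGGCAGGTTTAC",   # stands in for a monocot
    "amborella":   "AAACGTGGCTTGGCCA",
}
k = 6  # arbitrary word length chosen for the toy example

# Step 1: substrings shared by the two distantly related dicots.
dicot_shared = shared_kmers([toy["arabidopsis"], toy["columbine"]], k)

# Step 2: of those, the ones also present in the monocot ("deep").
deep = shared_kmers(
    [toy["arabidopsis"], toy["columbine"], toy["rice"]], k
)

# Step 3: of those, the ones also present in Amborella.
all_flowering = shared_kmers(list(toy.values()), k)

print(len(dicot_shared), len(deep), len(all_flowering))
```

Each step can only keep or shrink the previous set, which mirrors how the counts in the study narrow from over 200 to 59 to 51 as more distant species are added.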

So what could be the function of these deep CNSs? We can get clues by analyzing the types of genes with which these CNSs are associated. The researchers found that nearly all of the deep CNSs are associated with genes involved in basic and universal biological processes in flowering plants, such as development, response to hormones, and regulation of gene expression. They found that the majority of these CNSs are associated with genes involved in tissue and organ development, post-embryonic differentiation, flowering, and production of reproductive structures. Others are associated with hormone- and salt-responsive genes or with genes encoding transcription factors, which are regulatory proteins that control gene expression by turning other genes on and off.

In addition, they showed that these CNSs are enriched for binding sites for transcription factors, and they propose that the function of some of this non-coding DNA is to act as a scaffold for organization of the gene expression machinery. The binding sites they found are known sequences implicated in other plants as necessary for response to biotic and abiotic stress, light, and hormones. Furthermore, they discovered that a number of the CNSs could produce RNAs that have extensive double-stranded regions. Such double-stranded regions have been shown to be involved in RNA stability, RNA degradation, and regulation of gene expression. Twelve of the 59 most highly conserved CNSs are associated with genes whose protein products interact with RNA. Clearly, these DNA sequences are not merely "junk!"

Now that Burgess and Freeling have identified the most highly conserved non-coding DNA sequences in flowering plants, future scientists have a better idea of which regions of the genome to focus on for functional studies. Do the predicted transcription factor-binding sites actually bind known or novel transcription factors? Do CNSs organize or regulate the gene expression machinery? Do CNSs encode RNAs that regulate fundamental processes in plants? These and many related questions will be easier to answer now that we have this set of deep CNSs that are likely to play important roles in basic cellular processes in plants.

References

(1) National Human Genome Research Institute (see www.genome.gov/10005107)
(2) Genome Research, Vol. 17, June 2007, special issue on ENCODE.
