A new UC San Francisco study highlights the potential importance of the vast majority of the cell's DNA, which lies outside of genes.
The researchers found that about 85 percent of these stretches of DNA make RNA, a molecule increasingly found to play important roles within cells. They also determined that this RNA-making DNA is more likely than other non-gene DNA regions to be associated with inherited disease risks.
The study, published in the free online journal PLOS Genetics on June 20, 2013, is one of the most extensive examinations of the human genome ever undertaken to see which stretches of DNA outside of genes make RNA and which do not.
The researchers—senior author and RNA expert Michael McManus, PhD, UCSF associate professor of microbiology and immunology and a member of the UCSF Diabetes Center, graduate student Ian Vaughn, and postdoctoral fellow Matthew Hangauer, PhD—identified thousands of previously unknown, unique RNA sequences.
"Now that we realize that all these RNA molecules exist and have identified them, the struggle is to understand which are going to have a function that is important," McManus said. "It may take decades to determine this."
The RNA most familiar from textbooks is the messenger RNA that is transcribed from DNA in genes and that encodes the amino acid building blocks of proteins. The transcription of messenger RNA from DNA is a key step in protein production. The rest of the DNA on the cell's chromosomes was once thought not to be transcribed into RNA, and was referred to as junk DNA.
Today, scientists estimate that only 1.5 percent of the genome consists of genes, McManus said. But over the last two decades other kinds of RNA have been identified that are transcribed from DNA outside of gene regions. Some of these RNA molecules play important biological roles, but scientists debate whether few or most of these RNA molecules are likely to be biologically significant.
Among the RNA transcribed from the DNA outside of genes, the UCSF researchers identified thousands of previously unknown RNA sequences of a type called lincRNA. So far, only a handful of lincRNA molecules are known to play significant roles in human biology, McManus said.
Previous research has shown that lincRNAs can have diverse functions. Some control the activity of genes that encode proteins. Others guide protein production in alternative ways.
"RNA is the Swiss army knife of molecules—it can have so many different functions," McManus said.
The development of RNA-sequencing techniques in recent years has made possible the collection of massive amounts of RNA data for the first time.
To identify unique RNA molecules that are transcribed from human DNA, the UCSF researchers re-examined RNA transcription data drawn from more than 125 data sets, obtained in recent years by scientists who studied 24 types of human body tissues. The new study represents one of the largest collections of lincRNAs gathered to date.
McManus said that the findings are in general agreement with those reported in September 2012 by researchers associated with a project called ENCODE, which included among its goals the detection of RNA transcripts within the genome. Many of the cells examined in ENCODE were long-lived laboratory cell lines and cancer cell lines, whereas the data analyzed in the UCSF study was from normal healthy human tissue, McManus said.