Ribonucleotides, the building blocks of RNA, can become embedded in genomic DNA during processes such as DNA replication and repair, affecting the stability of the genome by contributing to DNA fragility and mutability. Scientists have known about the presence of ribonucleotides in DNA, but until now had not been able to determine exactly which ribonucleotides are present and where they are located in the DNA sequences.
Now, researchers have developed and tested a new technique known as ribose-seq that allows them to determine the full profile of ribonucleotides embedded in genomic DNA. Using ribose-seq, they have found widespread but not random incorporation and "hotspots" where the RNA insertions accumulate in the nuclear and mitochondrial DNA of a commonly studied species of budding yeast. Ribose-seq could be used to locate ribonucleotides in the DNA of a wide range of other organisms, including humans.
"Ribonucleotides are the most abundant non-standard nucleotides that can be found in DNA, but until now there has not been a system to determine where they are located in the DNA, or to identify specifically which type they are," said Francesca Storici, an associate professor in the School of Biology at the Georgia Institute of Technology. "Because they change the way that DNA works, in both its structure and function, it is important to know their identity and their sites of genomic incorporation."
A description of the ribose-seq method and what it revealed in the DNA of the budding yeast species Saccharomyces cerevisiae will be reported on January 26 in the journal Nature Methods. The findings resulted from a collaboration between researchers in Storici's laboratory at Georgia Tech, including graduate students Kyung Duk Koh and Sathya Balachander, and assistant professor Jay Hesselberth at the University of Colorado Anschutz Medical School.
The research was supported by the National Science Foundation, the Georgia Research Alliance, the American Cancer Society, the Damon Runyon Cancer Research Foundation, and the University of Colorado Golfers Against Cancer.
Because of the extra hydroxyl (OH) group in the ribonucleotides, their presence distorts the DNA and creates sensitive sites where reactions with other molecules can take place. Of particular interest are reactions between the OH group and alkaline solutions, which can make the DNA more susceptible to cleavage.
Ribose-seq takes advantage of this reaction with the hydroxyl group to launch the process of identifying the genomic spectrum of ribonucleotide incorporation. Researchers first cleave the DNA samples at the ribonucleotides, then take the resulting fragments through a specialized process that generates a library of DNA sequences containing the sites of ribonucleotide incorporation and their upstream sequence. High-throughput sequencing of the library and alignment of the sequencing reads to a reference genome then identify the profile of ribonucleoside monophosphate (rNMP) incorporation events.
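The alignment step can be pictured with a small sketch. It assumes, consistent with a library that contains each incorporation site plus its upstream sequence, that every aligned fragment ends on the embedded ribonucleotide at its 3' end. The `Alignment` record, the toy reference sequence and the function names are hypothetical illustrations, not the published analysis pipeline.

```python
from collections import Counter
from typing import NamedTuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


class Alignment(NamedTuple):
    """Hypothetical minimal alignment record (not a real library type)."""
    chrom: str
    start: int   # 0-based, inclusive, in forward-strand coordinates
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def rnmp_site(aln):
    """Return (chrom, position, strand) of the fragment's 3'-terminal base.

    Assumption: the 3' end of each fragment is the embedded ribonucleotide.
    On the minus strand the 3' end is the lowest forward coordinate.
    """
    pos = aln.end - 1 if aln.strand == "+" else aln.start
    return (aln.chrom, pos, aln.strand)


def rnmp_base(reference, site):
    """Look up the base at a site, complementing it for minus-strand sites."""
    chrom, pos, strand = site
    base = reference[chrom][pos]
    return base if strand == "+" else COMPLEMENT[base]


def profile(alignments):
    """Count how many fragments support each candidate rNMP site."""
    return Counter(rnmp_site(a) for a in alignments)


# Toy data, invented for illustration only.
reference = {"chrI": "ACGTACGGCA"}
alignments = [
    Alignment("chrI", 0, 6, "+"),
    Alignment("chrI", 2, 6, "+"),
    Alignment("chrI", 7, 10, "-"),
]
counts = profile(alignments)
for site, n in sorted(counts.items()):
    print(site, rnmp_base(reference, site), n)
```

In this toy example, two plus-strand fragments end on the same position, so that site is counted twice; real data would come from an aligner's output rather than hand-written records.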
"Ribose-seq is specific to directly capturing ribonucleotides embedded in DNA and does not capture RNA primers or Okazaki fragments formed during DNA replication, breaks or abasic sites in DNA," Storici noted.
"For this reason, ribose-seq has application for rNMP mapping in any genomic DNA, from large nuclear genomes to small genomic molecules such as plasmids and mitochondrial DNA, with no need of standardization procedures," she said. "It also allows mapping rNMPs even in conditions in which the DNA is exposed to environmental stressors that damage the DNA by generating breaks and/or abasic sites."
The extra hydroxyl group found in the ribonucleotides is key to the ribose-seq technique, said Koh, the paper's first author. "The OH group is specific to the ribonucleotides," he explained. "That allowed us to build a new tool for recognizing specifically where the ribonucleotides are located."
The high-throughput sequencing and initial data analysis were done in the Hesselberth laboratory in the Department of Biochemistry and Molecular Genetics at the University of Colorado Anschutz Medical School.
To validate their method, the researchers tested ribose-seq on the much-studied yeast species. The analyses revealed a strong preference for the cytidine and guanosine bases at the ribonucleotide sites.
"The ribonucleotides are not randomly distributed, and there is some preference for specific base sequences and specific base composition of the ribonucleotide itself," said Koh. "By looking at the non-random distribution, we found several hotspots in which the ribonucleotides are incorporated into the genome."
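One simple way to look for this kind of bias is sketched below, under stated assumptions. It compares how often each base appears at mapped sites with how often it appears in the genome, and flags sites supported by many fragments as candidate hotspots. The article does not say how hotspots were defined, so the `min_count` threshold, the function names and the toy data are all hypothetical, and the background uses a single strand for simplicity.

```python
from collections import Counter


def base_enrichment(site_bases, genome):
    """Ratio of each base's frequency at rNMP sites to its genome frequency.

    A value above 1.0 means the base is over-represented at the sites.
    """
    at_sites = Counter(site_bases)
    background = Counter(genome)
    n_sites = len(site_bases)
    n_genome = len(genome)
    return {
        base: (at_sites[base] / n_sites) / (background[base] / n_genome)
        for base in "ACGT"
        if background[base] > 0
    }


def hotspots(site_counts, min_count):
    """Return sites supported by at least min_count fragments (assumed cutoff)."""
    return sorted(site for site, n in site_counts.items() if n >= min_count)


# Toy data, invented for illustration only.
genome = "ACGT" * 3                      # each base makes up 25% of the background
site_bases = ["C", "C", "C", "C", "G", "G", "A", "T"]
enrichment = base_enrichment(site_bases, genome)

site_counts = Counter({("chrI", 5): 4, ("chrI", 9): 1, ("chrM", 2): 3})
candidates = hotspots(site_counts, min_count=3)
print(enrichment)
print(candidates)
```

A real analysis would need a statistical test against the expected distribution rather than a fixed cutoff; the sketch only shows the shape of the comparison.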
Knowledge of where the ribonucleotides cluster could help identify areas of greatest potential for genome instability and lead to a better understanding of how they affect the properties and activities of DNA.
"The fact that we see biases in the base compositions of the ribonucleotides allows us to tell which base is more likely to be incorporated into the DNA," Koh explained. "If there are specific signatures of genomic instability that are caused by the ribonucleotides, this will allow us to narrow down the locations and know where they are more likely to be found."
A next step will be to test ribose-seq on other DNA, Koh said. "Our technique could potentially be applied to any genome of any cell type from any organism as long as genomic DNA can be extracted from it," he added. "It is independent of specific organisms."
Beyond repair and replication processes, ribonucleotides can also be created in DNA as a result of damage caused by drugs, environmental stressors and other factors. The ribose-seq method could also allow scientists to study the impact of these processes.
"Ribose-seq should allow us to better understand the impact of ribonucleotides on the structure and function of DNA," said Storici. "Identifying specific signatures of ribonucleotide incorporation in DNA may represent novel biomarkers for human diseases such as cancer, and other degenerative disorders."
Ribose-seq: global mapping of ribonucleotides embedded in genomic DNA, Nature Methods, DOI: 10.1038/nmeth.3259