Researchers at Oregon State University have developed a computer program that marks a key step toward understanding how genetic mutations are linked to disease.
Known as bpRNA, the software is a big-data annotation tool for secondary structures in ribonucleic acids.
"It's capable of parsing RNA structures, including complex pseudoknot-containing RNAs, so you end up with an objective, precise, easily-interpretable description of all loops, stems and pseudoknots," said corresponding author David Hendrix. "You also get the positions, sequence and flanking base pairs of each structural feature, which enables us to study RNA structure en masse at a large scale."
RNA works with DNA, the other nucleic acid, to produce the proteins needed throughout the body. Nucleic acids are so named because they were first discovered in the cell nuclei of living things. DNA contains a person's hereditary information, and RNA delivers the information's coded instructions to the protein-manufacturing sites within the cells. Many RNA molecules do not encode a protein, and these are known as noncoding RNAs.
"There are plenty of examples of disease-associated mutations in noncoding RNAs that probably affect their structure, and in order to statistically analyze why those mutations are linked to disease we have to automate the analysis of RNA structure," said Hendrix, assistant professor of biochemistry and biophysics in the College of Science. "RNA is one of the fundamental, essential molecules for life, and we need to understand RNAs' structure to understand how they function."
Secondary structures are the base-pairing interactions within a single nucleic acid polymer or between two polymers. DNA mainly forms fully base-paired double helices, but RNA is single-stranded and can fold back on itself to form complicated interactions.
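To make the idea of base pairs, stems and loops concrete, here is a minimal Python sketch. It is not bpRNA's code or algorithm. It assumes the structure is written in dot-bracket notation, a common text format (not mentioned in the article) in which matching parentheses mark paired bases and dots mark unpaired ones. The function names are hypothetical, and the sketch handles only nested structures, not the pseudoknots that bpRNA is reported to parse.

```python
def parse_pairs(structure):
    """Return sorted 0-based (i, j) base pairs from a dot-bracket string."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)              # remember the opening base
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unsupported character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def find_stems(pairs):
    """Group directly stacked pairs (i, j), (i+1, j-1), ... into stems."""
    stems = []
    for i, j in pairs:
        if stems and stems[-1][-1] == (i - 1, j + 1):
            stems[-1].append((i, j))       # extends the current stem
        else:
            stems.append([(i, j)])         # starts a new stem
    return stems


def find_hairpins(structure, pairs):
    """Return (start, end) of unpaired runs closed by a single base pair."""
    return [
        (i + 1, j - 1)
        for i, j in pairs
        if j - i > 1 and set(structure[i + 1:j]) == {"."}
    ]


# Hypothetical toy structure: two small hairpins joined by two unpaired bases.
structure = "((..))..((...))"
pairs = parse_pairs(structure)
stems = find_stems(pairs)
hairpins = find_hairpins(structure, pairs)
print(pairs)
print(stems)
print(hairpins)
```

Each stem comes back as a list of stacked base pairs and each hairpin loop as the start and end positions of its unpaired run. This is a far simpler version of the per-feature positions the article describes.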
Hendrix says bpRNA, presented this month in a paper in Nucleic Acids Research, features the largest and most detailed database to date of RNA secondary structures.
"To be fair it's a meta-database, but our special sauce is the tool to annotate everything," said Hendrix, who is also an assistant professor in the OSU College of Engineering. "Before there was no way of saying where all the structural features were in an automated way. We provide a color-coded map of where everything is. These annotations will enable us to identify statistical trends that may shed light on RNA structure formation and may open the door for machine learning algorithms to predict secondary RNA structure in ways that haven't been possible."
Researchers have successfully tested the tool on more than 100,000 structures, "many of which are very complex, with lots of complex pseudoknots."
"Every day new RNAs are discovered and researchers are making huge progress in understanding their function," Hendrix said. "We're starting to appreciate that the genome is full of noncoding RNAs in addition to messenger RNAs, and they're important biological molecules with big effects on human health and disease."
Padideh Danaee et al., "bpRNA: large-scale automated annotation and analysis of RNA secondary structure," Nucleic Acids Research (2018). DOI: 10.1093/nar/gky285