Computational method dramatically speeds up estimates of gene expression

Apr 20, 2014

With gene expression analysis growing in importance for both basic researchers and medical practitioners, researchers at Carnegie Mellon University and the University of Maryland have developed a new computational method that dramatically speeds up estimates of gene activity from RNA sequencing (RNA-seq) data.

With the new method, dubbed Sailfish after the famously speedy fish, estimates of gene expression that previously took many hours can be completed in a few minutes, with accuracy that equals or exceeds previous methods. The researchers' report on their new method is being published online April 20 by the journal Nature Biotechnology.

Gigantic repositories of RNA-seq data now exist, making it possible to re-analyze experiments in light of new discoveries. "But 15 hours a pop really starts to add up, particularly if you want to look at 100 experiments," said Carl Kingsford, an associate professor in CMU's Lane Center for Computational Biology. "With Sailfish, we can give researchers everything they got from previous methods, but faster."

Though an organism's genetic makeup is static, the activity of individual genes varies greatly over time, making gene expression an important factor in understanding how organisms work and what occurs during disease processes. Gene activity can't be measured directly, but can be inferred by monitoring RNA, the molecules that carry information from the genes for producing proteins and other cellular activities. RNA-seq is a leading method for producing these snapshots of gene expression; in genomic medicine, it has proven particularly useful in analyzing certain cancers.

The RNA-seq process results in short sequences of RNA, called "reads." In previous methods, the RNA molecules from which they originated could be identified and measured only by painstakingly mapping these reads to their original positions in the larger molecules.

But Kingsford, working with Rob Patro, a post-doctoral researcher in the Lane Center, and Stephen M. Mount, an associate professor in Maryland's Department of Cell Biology and Molecular Genetics and its Center for Bioinformatics and Computational Biology, found that the time-consuming mapping step could be eliminated. Instead, they found they could allocate parts of the reads to different types of RNA molecules, much as if each read acted as several votes for one molecule or another.
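The voting idea can be illustrated with a toy sketch. This is not the Sailfish code: it assumes the "parts" of a read are fixed-length substrings (k-mers), and the function names `kmers` and `allocate_votes`, the two made-up transcripts, and the reads are all hypothetical. It also ignores transcript length and the other corrections a real tool would need.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def allocate_votes(transcripts, reads, k=4, rounds=50):
    """Toy abundance estimate: split each k-mer vote among candidate transcripts."""
    # Index: for each k-mer, which transcripts contain it and how many times.
    index = defaultdict(Counter)
    for name, seq in transcripts.items():
        for km in kmers(seq, k):
            index[km][name] += 1

    # Count the k-mer "votes" cast by the reads; skip k-mers found in no transcript.
    votes = Counter(km for read in reads for km in kmers(read, k) if km in index)

    # Start with equal abundances, then repeatedly re-split every shared vote
    # in proportion to the current estimates (an expectation-maximization loop).
    abundance = {name: 1.0 / len(transcripts) for name in transcripts}
    for _ in range(rounds):
        totals = dict.fromkeys(transcripts, 0.0)
        for km, n in votes.items():
            weights = {t: abundance[t] * c for t, c in index[km].items()}
            norm = sum(weights.values())
            if norm == 0:
                continue
            for t, w in weights.items():
                totals[t] += n * w / norm
        total = sum(totals.values())
        if total == 0:
            break  # no usable votes; keep the previous estimate
        abundance = {t: v / total for t, v in totals.items()}
    return abundance


# Hypothetical data: the two transcripts share the stretch "CCCGGGT".
transcripts = {"transcript_A": "AAACCCGGGT", "transcript_B": "CCCGGGTTTA"}
reads = ["AAACCC", "CCCGGG"]  # the first read is unique to A, the second is ambiguous
print(allocate_votes(transcripts, reads))
```

In this example the ambiguous read's votes start out split evenly, but because the other read supports only transcript_A, each round shifts more of the shared votes toward it, and transcript_A ends up with nearly all of the estimated abundance.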

Without the mapping step, Sailfish can complete its RNA analysis 20-30 times faster than previous methods.

This numerical approach might not be as intuitive as a map to a biologist, but it makes perfect sense to a computer scientist, Kingsford said. Moreover, the Sailfish method is more robust—better able to tolerate errors in the reads or differences between individuals' genomes. These errors can prevent some reads from being mapped, he explained, but the Sailfish method can make use of all the RNA read "votes," which improves the method's accuracy.
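A small sketch can show why a single error need not waste a whole read. It rests on simplifying assumptions: exact substring matching stands in for the mapping step (real mappers tolerate some mismatches), the "votes" are taken to be k-mers, and the sequences and function names are invented for illustration.

```python
def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def read_maps_exactly(read, transcript):
    """Stand-in for mapping: the whole read must appear verbatim in the transcript."""
    return read in transcript


def usable_votes(read, transcript, k):
    """Return the read's k-mers that also occur in the transcript."""
    known = set(kmers(transcript, k))
    return [km for km in kmers(read, k) if km in known]


# Hypothetical transcript and a read copied from it with one wrong base.
transcript = "ACGTTGCAAGCTTAGGCAT"
clean_read = transcript[2:14]                      # "GTTGCAAGCTTA"
noisy_read = clean_read[:10] + "C" + clean_read[11:]  # one base changed: T -> C

print(read_maps_exactly(noisy_read, transcript))   # False: the error blocks an exact match
print(len(usable_votes(noisy_read, transcript, 4)), "of", len(kmers(noisy_read, 4)))
# 7 of 9: only the k-mers that overlap the wrong base are lost
```

Under these assumptions the erroneous read is rejected outright by whole-read matching, yet most of its k-mers still match the transcript and can be counted.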


More information: The Sailfish code has been released and is available for download at www.cs.cmu.edu/~ckingsf/software/sailfish/

Paper: Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms, Nature Biotechnology, DOI: 10.1038/nbt.2862
