Search tool for gene expression databases could uncover therapeutic targets, biological processes

Oct 01, 2013

A new computational tool developed by U.S. and Israeli scientists will help researchers exploit the massive databases of gene expression experimental results that have been created over the past decade. Its developers say it could uncover new links between diseases and treatments and provide new insights into biological processes.

The team, headed by Ziv Bar-Joseph of Carnegie Mellon University, reports in the October issue of the journal Nature Methods that the tool, called ExpressionBlast, enables searches based directly on experimental values, rather than keywords.

The researchers already have used ExpressionBlast to uncover intriguing clues about SIRT6, the first gene shown to extend lifespan in mice and thus a potentially important drug target. By mining experimental data stored in a public repository called the Gene Expression Omnibus (GEO) maintained by the National Center for Biotechnology Information, they found that SIRT6 may be involved with functions that include immune response, metabolism and the regulation of gender-specific genes.

"Because so little is known about SIRT6, it would be difficult to search the hundreds of thousands of GEO datasets using keywords and, without other guidance, it would be practically impossible to find other experiments with similar to SIRT6," said Bar-Joseph, an associate professor of computational biology and machine learning. "ExpressionBlast enabled us to take SIRT6 from just two mouse experiments and find other experimental data in GEO with similar expression patterns."

The tool is available online at http://www.expression.cs.cmu.edu/. The search engine enables researchers to search for expression patterns that are similar or opposite to their own results and can search within and across species. Guy Zinman, Shoshana Naiman, Yariv Kanfi and Haim Cohen of Bar-Ilan University worked with Bar-Joseph to develop ExpressionBlast and are co-authors of the journal report. Their intention was to develop a tool for gene expression queries that would be the equivalent of BLAST, a two-decade-old tool for searching gene sequence databases that remains one of the most widely used tools in bioinformatics.
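To make the idea of searching for "similar or opposite" expression patterns concrete, here is a minimal sketch in Python. It is not ExpressionBlast's actual method, which the article does not describe; it simply assumes that each experiment is a list of expression changes for the same genes and ranks stored experiments by Pearson correlation with a query. The function names, experiment names and numbers are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / math.sqrt(var_x * var_y)


def rank_profiles(query, database, mode="similar"):
    """Rank stored profiles against the query.

    mode="similar" puts the most positively correlated profile first;
    mode="opposite" puts the most negatively correlated profile first.
    """
    scored = [(name, pearson(query, profile)) for name, profile in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=(mode == "similar"))


# Hypothetical expression changes for the same five genes in each experiment.
query = [2.0, 1.0, 0.0, -1.0, -2.0]
database = {
    "exp_A": [1.9, 1.1, 0.2, -0.8, -2.1],   # tracks the query closely
    "exp_B": [-2.0, -1.0, 0.0, 1.0, 2.0],   # mirror image of the query
    "exp_C": [0.1, -0.3, 0.2, 0.0, -0.1],   # mostly noise
}

most_similar = rank_profiles(query, database, mode="similar")[0][0]
most_opposite = rank_profiles(query, database, mode="opposite")[0][0]
```

In this toy data, the experiment that tracks the query ranks first in a "similar" search and its mirror image ranks first in an "opposite" search. A real system would also have to match genes across array platforms and species, which this sketch ignores.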

Genes encode the information necessary for life, while gene expression is the process by which that genetic information is transformed into proteins and by which genes are regulated. Understanding gene expression thus is critical for understanding biological and disease processes. This information is so important that, for the past decade or so, most leading journals have required researchers who publish papers on gene expression to submit their data to public repositories such as GEO.

GEO alone holds data from more than 1 million microarrays. Each of these microarrays might contain up to 40,000 numerical values – which indicate which genes are over- or underexpressed, and by how much. GEO and the European Bioinformatics Institute's ArrayExpress thus represent a treasure trove of potential discoveries. But existing searches are often dependent on keyword summaries submitted by each researcher, or require manual comparisons of microarrays.
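As a rough illustration of how a single numerical value can say whether a gene is over- or underexpressed, and by how much, the sketch below uses a log2 fold change between a treated and a control measurement. This is a common convention for expression data, but it is an assumption here: the article does not say how any given GEO dataset encodes its values. The function names, intensities and the twofold threshold are hypothetical.

```python
import math


def log2_fold_change(treated, control):
    """Log2 ratio of treated to control intensity (both must be positive)."""
    return math.log2(treated / control)


def classify(lfc, threshold=1.0):
    """Label a gene using a symmetric log2 threshold (1.0 means twofold)."""
    if lfc >= threshold:
        return "overexpressed"
    if lfc <= -threshold:
        return "underexpressed"
    return "unchanged"


# Hypothetical intensities: (treated, control)
up = log2_fold_change(400.0, 100.0)     # four times higher in treated sample
down = log2_fold_change(50.0, 100.0)    # half the control level
flat = log2_fold_change(110.0, 100.0)   # within the twofold band

labels = [classify(value) for value in (up, down, flat)]
```

The sign of the value gives the direction of the change and its magnitude gives the size, so a value of 2.0 means a fourfold increase and -1.0 means a halving.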

ExpressionBlast uses novel, automated and scalable text analysis algorithms to transform the unstructured data in GEO so that it can be systematically searched. The researchers have thus far processed tens of thousands of expression series representing hundreds of thousands of individual arrays across several species. Once processed in this way, the data can be accessed easily via a graphical interface. This work was supported by a grant from the National Institutes of Health and a National Science Foundation Innovation Corps (I-Corps) award.
