Bioengineers create first online search engine for functional genomics data

May 2, 2016, University of California - San Diego
Screenshot from GeNemo, the first online search engine for functional genomics data. GeNemo is free for public use at: http://www.genemo.org. Credit: Sheng Zhong / UC San Diego bioengineering

University of California San Diego bioengineers have created what they believe to be the first online search engine for functional genomics data. This work from the Sheng Zhong bioengineering lab at UC San Diego was just published online by the journal Nucleic Acids Research. This new search engine, called GeNemo, is free for public use at: www.genemo.org.

GeNemo addresses a pressing challenge: effectively searching functional genomic data from online data repositories. (The name GeNemo is a combination of "Ge" from the word gene and Nemo from the movie "Finding Nemo.")

The functions of an organism's genome, captured in functional genomic data, are directly relevant to health and disease. Functional genomics data record the diverse activities of every piece of an organism's genome. The new search system may lead researchers to uncover the functional aspects in specific parts of genomes that are associated with normal physiology or disease of specific organs and tissues.

GeNemo queries user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions.

Instead of just "searching by text," the new tool allows researchers to search inside the functional data. One example is searching for binding patterns similar to those of a novel transcription factor.

"If you think of functional genomic data files as video files, then the 'text search' is like searching by keywords in the title or the description of a video file. The 'inside data search' is like searching for a video clip by pattern matching within the video itself," explained Zhong.

"Functional genomic assays are producing massive amounts of data, in challenging data types. We have developed an online tool that empowers users to input any complete or partial functional genomic dataset, for example, a binding intensity file like bigWig, or a peak file," explained UC San Diego bioengineering scientist Xiaoyi Cao, a joint first author on the paper. "GeNemo reports any genomic regions, ranging from 100 bases to 100,000 bases, from any of the online ENCODE datasets that share similar functional patterns such as binding, modification and accessibility."

Functional genomic assay data opportunities

Using DNA sequencing as a high-throughput readout, functional genomic assays can interrogate genome-wide distributions of transcription factor binding (ChIP-seq), epigenetic modifications (ChIP-seq), regulatory regions (DNase-seq, FAIRE-seq) and other functional outcomes. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science.
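To make the idea of searching by regions rather than by text more concrete, here is a minimal Python sketch that ranks peak files by how much of the genome they share with a query. It is an illustration only and is not GeNemo's algorithm: the Jaccard overlap score, the function names, and the tiny example regions are all assumptions made for this sketch. It reads only the first three BED columns (chromosome, start, end).

```python
from collections import defaultdict


def merge(intervals):
    """Sort intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def parse_bed(text):
    """Parse minimal BED text into {chrom: merged [(start, end), ...]}."""
    regions = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and common header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))
    return {chrom: merge(ivs) for chrom, ivs in regions.items()}


def covered(intervals):
    """Total number of bases covered by merged intervals."""
    return sum(end - start for start, end in intervals)


def overlap(a, b):
    """Bases shared by two sorted, merged interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(query, dataset):
    """Shared bases divided by bases covered by either set of regions."""
    shared = union = 0
    for chrom in set(query) | set(dataset):
        a, b = query.get(chrom, []), dataset.get(chrom, [])
        s = overlap(a, b)
        shared += s
        union += covered(a) + covered(b) - s
    return shared / union if union else 0.0


def rank(query, datasets):
    """Return (name, score) pairs, most similar dataset first."""
    scores = [(name, jaccard(query, regions)) for name, regions in datasets.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical example regions, not real ENCODE data.
query = parse_bed("chr1\t100\t200\nchr1\t300\t400\n")
datasets = {
    "dataset_a": parse_bed("chr1\t150\t250\nchr2\t0\t50\n"),
    "dataset_b": parse_bed("chr1\t100\t200\nchr1\t300\t380\n"),
}
ranking = rank(query, datasets)
```

In this toy case `dataset_b` ranks first because it covers nearly the same bases as the query. A production search over intensity files such as bigWig would need a different similarity measure and an index; the sketch only shows why the result depends on the data inside the files rather than on their titles or descriptions.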

According to the researchers, this is the first software to be released for executing functional genomic data searches online.

"I am excited to see how different research teams from around the world use this powerful new tool to make better use of the massive amounts of functional genomic data that is being generated every day," said Zhong.

More information: Yongqing Zhang et al, GeNemo: a search engine for web-based functional genomic data, Nucleic Acids Research (2016). DOI: 10.1093/nar/gkw299
