One Codex in open beta for genomic data search

Aug 17, 2014 by Nancy Owano weblog

Data, data everywhere, and now as ever researchers need the best tools to make that data useful. In medicine, searching through genomic data can take some time. A startup called One Codex hopes to make a difference with its genetic search platform, which can process data sets quickly. A report on the company's work, published Friday in TechCrunch, noted the advantage of One Codex's speed. "Currently," wrote Julian Chokkattu, "the most commonly used tool for genome searching is by using an algorithm called BLAST, Basic Local Alignment Search Tool, which compares primary biological sequence information." For Nick Greenfield, cofounder of One Codex, a file uploaded to BLAST took two minutes and 30 seconds to process, compared with less than 1/20th of a second on the One Codex system.

The company defines One Codex as a search engine for genomic data. The TechCrunch piece describes what it offers as a service platform for genomics. Apart from using search technology, said Chokkattu, the platform also acts as an indexed, curated reference.
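To put the reported timing comparison in perspective, here is a rough back-of-the-envelope calculation. It uses only the two figures quoted above (two minutes and 30 seconds for BLAST, less than 1/20th of a second for One Codex). The variable names are illustrative, and the result reflects one anecdotal measurement on a single file, not a benchmark.

```python
# Timings as reported in the article; both come from a single anecdotal test.
blast_seconds = 2 * 60 + 30   # "two minutes and 30 seconds"
one_codex_seconds = 1 / 20    # upper bound: "less than 1/20th of a second"

# The One Codex time is an upper bound, so this ratio is a lower bound
# on the speedup for that one file.
min_speedup = blast_seconds / one_codex_seconds

print(f"BLAST: {blast_seconds} s; One Codex: under {one_codex_seconds} s")
print(f"Speedup on this file: at least {min_speedup:.0f}x")
```

On those numbers, the One Codex search was at least about 3,000 times faster for that file. How the ratio holds up on other inputs is not something the reported figures can show.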

The company said that it can search the world's largest index of bacterial, viral, and fungal genomes. A key advantage is speed. The product can, said the company, "process next-generation datasets in minutes, not days (millions of DNA base pairs per second)."

The two founders are Nick Greenfield, a former data scientist, and Nik Krumm, who has a PhD in genome sciences from the University of Washington.

Sample applications would be in clinical diagnostics, food safety and biosecurity. Right now, said TechCrunch, the company is focusing on testing their platform with hospitals and agencies. One Codex is in open beta.

Scientific interest in being able to search faster has been in evidence for some years. In 2012, MIT's news office reported on a study in Nature Biotechnology, where MIT and Harvard researchers described an algorithm "that drastically reduces the time it takes to find a particular gene sequence in a database of genomes. Moreover, the more genomes it's searching, the greater the speedup it affords, so its advantages will only compound as more data is generated."

The authors of that paper, titled "Compressive genomics," said, "In the past two decades, genomic sequencing capabilities have increased exponentially, outstripping advances in computing power. Extracting new insights from the data sets currently being generated will require not only faster computers, but also smarter algorithms." They stated that although compression schemes for BLAST and BLAT that they presented yield an increase in computational speed and in scaling, "they are only a first step."

More information: One Codex: onecodex.com/
