Google strikes deal to preserve DNA data online

October 27, 2011 By Peter Delevett

Concerned that the federal government might not keep funding the world's largest free database of genetic data, Google Inc. has forged a deal with a Mountain View, Calif., startup to keep the information online - and free for researchers.

The Internet giant began talks with DNAnexus last spring, when the National Institutes of Health announced it might have to drop support of the Sequence Read Archive due to funding cuts.

The database is a vast repository for short snippets of DNA decoded by sequencing machines, which spell out the unique combinations that make up a specific person's DNA. Researchers can compare data in the archive to look for similarities and differences between people and better unravel how genetics affects health.

While NIH officials recently announced they'd keep funding the database, "We wanted to make sure we had a Plan B," said DNAnexus Chief Executive Andreas Sundquist. Using the Google Cloud Storage service, DNAnexus will maintain a "mirror" of the public archive, together with tools Sundquist's company has developed to make it easier for scientists to search the database and share their findings.

Google officials said the Sequence Read Archive is one of the largest datasets ever deposited in Google Storage, but Sundquist predicts there will be thousands of similar databases in the future. That's due both to the growing speed and power of gene sequencers and to the decreasing cost of storing and sharing information in the remote network of web servers known as the cloud.

"DNA sequencing becomes 10 times cheaper every 18 months thanks to hardware improvements," said Sundquist. "It's sort of like Moore's Law on steroids."

A year ago, he says, it cost about $30,000 to sequence a person's entire DNA. Today, that number's down to $4,000. Sundquist believes researchers eventually will sequence everyone on earth and make that data part of each person's medical record.
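The two figures Sundquist cites can be checked against each other with a little arithmetic. The sketch below assumes a smooth exponential decline, which is a simplification; the function names are illustrative and not from DNAnexus. It shows that a tenfold drop every 18 months would put a $30,000 genome at roughly $6,500 after a year, so the quoted fall to $4,000 is somewhat faster and corresponds to a tenfold drop in just under 14 months.

```python
import math


def projected_cost(start_cost, months, months_per_tenfold=18.0):
    """Cost after `months`, assuming the price falls tenfold every
    `months_per_tenfold` months (a smooth exponential; an assumption)."""
    return start_cost / 10 ** (months / months_per_tenfold)


def months_per_tenfold_drop(start_cost, end_cost, months):
    """Months needed for a tenfold price drop, inferred from one observed decline."""
    return months / math.log10(start_cost / end_cost)


# Figures quoted in the article: $30,000 a year ago, $4,000 today.
quoted_pace = projected_cost(30_000, 12)
observed_pace = months_per_tenfold_drop(30_000, 4_000, 12)

print(f"Cost after 12 months at 10x per 18 months: ${quoted_pace:,.0f}")
print(f"Months per tenfold drop implied by $30,000 -> $4,000: {observed_pace:.1f}")
```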

But given that each genome is about 3 billion letters long, improvements in gene sequencing are creating a massive data management challenge, said Krishna Yeshwant, a partner at Google Ventures who's joining Sundquist's board as part of a separate $15 million investment in DNAnexus.

The SRA database alone, he said, is hundreds of terabits in size, referring to the unit of measure for a trillion bits of computer data.
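For a sense of scale, a back-of-the-envelope calculation helps. The sketch below assumes each of the four DNA letters is packed into two bits, an idealized encoding chosen here for illustration. Raw sequencing archives store many overlapping reads along with quality scores, so real files are far larger than this lower bound.

```python
# Assumption: A, C, G and T are packed at two bits per letter.
BITS_PER_LETTER = 2
GENOME_LETTERS = 3_000_000_000  # "about 3 billion letters", per the article
TERABIT = 10 ** 12              # a trillion bits, as defined in the article


def genome_bits(letters=GENOME_LETTERS, bits_per_letter=BITS_PER_LETTER):
    """Bits needed to store one genome under the idealized encoding."""
    return letters * bits_per_letter


bits = genome_bits()
megabytes = bits / 8 / 10 ** 6
genomes_per_terabit = TERABIT / bits

print(f"One genome: {bits:,} bits, or {megabytes:,.0f} megabytes")
print(f"Genomes per terabit: {genomes_per_terabit:,.0f}")
```

Under these assumptions one finished genome takes about 750 megabytes, and a single terabit holds fewer than 200 of them.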

The decreased cost of gene sequencing is making it possible for genomics to move out of the research lab and into clinical settings, added Yeshwant, a Harvard-educated physician who also practices at Brigham and Women's Hospital in Boston.

"It feels like we're on the cusp of a revolution in genomics and how we think about health care," he said.

Google has a long history of investment in genomics. One of the first companies to join its venture arm's portfolio was Adimab, a New Hampshire startup that helps discover how antibodies can be turned into drugs. Google - and co-founder Sergey Brin - also has poured millions into Mountain View's 23andMe, co-founded by Brin's wife to help consumers better understand their own DNA while building a massive database for researchers to study the genetic underpinnings of disease.

Still, the head of another local genomics startup said he was underwhelmed by Google's announcement that it would create a mirror of the SRA database.

"Part of the reason it was being discontinued is that NIH prioritized it as low value," said John West, CEO of Palo Alto, Calif.-based Personalis. "It's the most comprehensive database of really raw DNA sequencing data, but it's not very organized, and it's not easy to use."

Sundquist, in fact, would agree, which is why he's hoping the data management tools his team of 25 has developed will make the repository more useful.

He says his version of the database will include a more user-friendly interface to help scientists browse, download, analyze and share data. He envisions researchers plugging data into his virtual data center, finding out if a given person's genetic mutations exist elsewhere in the database and using those discoveries to figure out how the mutation impacts health - or how a given drug may affect certain people.

Sundquist, 32, founded the company two years ago while working on a Ph.D. in computer science at Stanford. Although he had no prior background in medicine, he was fascinated by the data challenges posed by the rapid improvements in gene sequencing.

"In less than five years," he predicted, "the cost of DNA sequencing will be on par with the cost of other routine lab tests."
