A new method to dramatically improve the sequencing of metagenomes

February 16, 2016, University of California - San Diego

An international team of computer scientists developed a method that greatly improves researchers' ability to sequence the DNA of organisms that can't be cultured in the lab, such as microbes living in the human gut or bacteria living in the depths of the ocean. They published their work in the Feb. 1 issue of Nature Methods.

The method, called TruSPADES, computationally generates so-called Synthetic Long Reads, segments of the genome about 10,000 base pairs long, from the commonly used short reads of just 300 base pairs produced by machines from San Diego-based Illumina.

Using Synthetic Long Reads instead of short reads to assemble a genome is like using entire chapters rather than single sentences to assemble a book. So there is a strong incentive to improve sequencing with long reads.

"This is the next generation of sequencing technologies," said Pavel Pevzner, a professor of computer science at the University of California, San Diego, and the lead author on the study. "It will make a significant impact on the practice of metagenomics sequencing."

Currently, the leaders in the long-read sequencing market, Pacific Biosciences and Oxford Nanopore, generate long reads that can be inaccurate and difficult to use in complex sequencing problems, such as assembling metagenomes—whole colonies of microbes sampled from their natural environment. By contrast, the Synthetic Long Reads are 100 times more accurate and can be rapidly generated on a massive scale to cover a large fraction of bacteria in metagenomes.

To develop their new method, researchers took the shorter reads, 100 to 300 base pairs, equipped with barcodes. They then assembled the short reads together into Synthetic Long Reads by representing them using a de Bruijn graph, a method often used in short-read sequencing. The graph allows researchers to determine which reads are connected together, resulting in the longer and more accurate Synthetic Long Reads.
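The idea of grouping barcoded short reads and joining them through a de Bruijn graph can be illustrated with a toy sketch in Python. This is not the TruSPADES implementation: the function names (build_de_bruijn, assemble_unambiguous, assemble_by_barcode), the example reads and the choice of k = 4 are all hypothetical, and the sketch assumes error-free reads with no repeated sequences, which real data never guarantees.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def assemble_unambiguous(graph):
    """Walk from the single start node for as long as the next step is unambiguous."""
    has_incoming = {succ for succs in graph.values() for succ in succs}
    starts = [node for node in graph if node not in has_incoming]
    if len(starts) != 1:
        raise ValueError("toy example expects exactly one start node")
    node = starts[0]
    contig = node
    visited = {node}
    # Stop at a dead end, a branch, or a node we have already seen (a cycle).
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:
            break
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig


def assemble_by_barcode(barcoded_reads, k):
    """Group (barcode, read) pairs by barcode and assemble each group separately."""
    by_barcode = defaultdict(list)
    for barcode, read in barcoded_reads:
        by_barcode[barcode].append(read)
    return {
        barcode: assemble_unambiguous(build_de_bruijn(reads, k))
        for barcode, reads in by_barcode.items()
    }


# Hypothetical, error-free overlapping reads from two barcoded fragments.
example_reads = [
    ("bc1", "ATGGCG"),
    ("bc2", "TTACCG"),
    ("bc1", "GGCGTG"),
    ("bc1", "CGTGCA"),
    ("bc2", "ACCGGA"),
]
long_reads = assemble_by_barcode(example_reads, k=4)
print(long_reads)
```

Grouping by barcode first keeps each graph small, because only reads that came from the same original fragment are assembled together. A production assembler also has to handle sequencing errors, repeats and uneven coverage, none of which this sketch attempts.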

The next step is to apply this method to the study of various microbial communities, ranging from human to marine microbiomes. Pevzner and co-author Anton Bankevich, from St. Petersburg State University, are working with Christopher Dupont, a researcher at the J. Craig Venter Institute, to do just that.

Metagenomics is especially challenging because researchers do not study a single species of bacteria but hundreds of them that live together in a community. When they extract a sample from the community and sequence it, they end up with bits of bacterial genomes from all the organisms in the community. It's very much like trying to solve hundreds of puzzles without knowing which pieces belong to which puzzle. TruSPADES and Synthetic Long Reads will help researchers solve these puzzles.

"This method gives us better results at a much smaller cost," said Dupont. "We are now assembling genomes for organisms we didn't even know existed."
