October 2, 2017

International competition benchmarks metagenomics software

Communities of bacteria live everywhere: inside our bodies, on our bodies and all around us. The human gut alone contains hundreds of species of bacteria that help digest food and provide nutrients, but can also make us sick. To learn more about these groups of bacteria and how they impact our lives, scientists need to study them. But this task poses challenges, because taking the bacteria into the laboratory is either impossible or would disrupt the biological processes the scientists wish to study.

To bypass these difficulties, scientists have turned to the field of metagenomics. In metagenomics, researchers use algorithms to piece together DNA from an environmental sample to determine the type and role of bacteria present. Unlike established fields such as chemistry, where researchers evaluate their results against a set of known standards, metagenomics is a relatively young field that lacks such benchmarks.

Mihai Pop, a professor of computer science at the University of Maryland with a joint appointment in the University of Maryland Institute for Advanced Computer Studies, recently helped judge an international challenge called the Critical Assessment of Metagenome Interpretation (CAMI), which benchmarked metagenomics software. The results were published in the journal Nature Methods on October 2, 2017.

"There's no one algorithm that we can say is the best at everything," said Pop, who is also co-director of the Center for Health-related Informatics and Bioimaging at UMD. "What we found was that one tool does better in one context, but another does better in another context. It is important for researchers to know that they need to choose software based on the specific questions they are trying to answer."

The study's results were not surprising to Pop, because of the many challenges metagenomics software developers face. First, DNA analysis is challenging in metagenomics because the recovered DNA often comes from the field, not a tightly controlled laboratory environment. In addition, DNA from many organisms, some of which may not have known genomes, mingles together in a sample, making it difficult to correctly assemble, or piece together, individual genomes. Moreover, DNA degrades in harsh environments.

"I like to think of metagenomics as a new type of microscope," Pop said. "In the old days, you would use a microscope to study bacteria. Now we have a much more powerful microscope, which is DNA sequencing coupled with advanced algorithms. Metagenomics holds the promise of helping us understand what bacteria do in the world. But first we need to tune that microscope."

CAMI's leader invited Pop to help evaluate the submissions by challenge participants because of his expertise in genome and metagenome assembly. In 2009, Pop helped publish Bowtie, one of the most commonly used software packages for aligning short DNA sequencing reads to a reference genome. More recently, he collaborated with the University of Maryland School of Medicine to analyze hundreds of thousands of gene sequences as part of the largest, most comprehensive study of childhood diarrheal diseases ever conducted in developing countries.

"We uncovered new, unknown bacteria that cause diarrheal diseases, and we also found interactions between bacteria that might worsen or improve illness," Pop said. "I feel that's one of the most impactful projects I've done using metagenomics."

For the competition, CAMI researchers combined approximately 700 microbial genomes and 600 viral genomes with other DNA sources and simulated how such a collection of DNA might appear in the field. The participants' task was to reconstruct and analyze the genomes of the simulated DNA pool.

CAMI researchers scored the participants' submissions in three areas: how well they assembled the fragmented genomes; how well they "binned," or organized, DNA fragments into related groups to determine the families of organisms in the mixture; and how well they "profiled," or reconstructed, the identity and relative abundance of the organisms present in the mixture. Pop contributed metrics and software for evaluating the submitted assembled genomes.

Nineteen teams submitted 215 entries using six genome assemblers, nine binners and 10 profilers to tackle this challenge.

The results showed that, for assembly, algorithms that pieced together a genome using smaller DNA fragments of several different lengths outperformed those that used fragments of a single fixed length. However, no assembler did well at separating distinct but closely related genomes.
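One way to see why fragment length matters is a toy sketch like the one below. It assumes the "smaller DNA fragments" are k-mers (overlapping substrings of length k), which the article does not state explicitly, and the function names and the sequence are made up for illustration. It is not how any CAMI assembler works; it only shows that a short fragment length can make two regions of a genome look identical while a longer one tells them apart.

```python
from collections import Counter


def kmers(sequence, k):
    """List every overlapping substring of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def repeated_kmers(sequence, k):
    """Return the k-mers that occur more than once in the sequence."""
    counts = Counter(kmers(sequence, k))
    return {kmer for kmer, n in counts.items() if n > 1}


# Hypothetical toy sequence, not real data.
toy_genome = "ACGTTACGA"

for k in (3, 4):
    # A repeated k-mer is a point where an assembler cannot tell
    # which of several continuations is the right one.
    print(k, sorted(repeated_kmers(toy_genome, k)))
```

With this toy sequence, "ACG" appears twice at k = 3 but no fragment repeats at k = 4. Longer fragments resolve more repeats but need more overlap between reads, which is one plausible reason to combine several lengths rather than commit to one.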

For the binning task, the researchers found tradeoffs between how accurately the software programs identified the group to which a particular DNA fragment belonged and how many DNA fragments the software assigned to any group at all. This result suggests that researchers need to choose their binning software based on whether accuracy or coverage is more important. In addition, the performance of all binning algorithms decreased when samples included multiple related genomes.
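The accuracy-versus-coverage tradeoff can be illustrated with a small sketch. The scoring below is a simplified stand-in, not the metrics CAMI actually used: "purity" counts a fragment as correct when it matches the majority genome in its bin, and "assigned fraction" is the share of fragments placed in any bin. All names and data are hypothetical.

```python
from collections import Counter, defaultdict


def binning_scores(true_genome, predicted_bin):
    """Return (purity, assigned_fraction) for a toy binning result.

    true_genome:   dict fragment_id -> genome of origin (the known answer)
    predicted_bin: dict fragment_id -> bin label, or None if left unassigned
    """
    bins = defaultdict(Counter)
    for frag, bin_label in predicted_bin.items():
        if bin_label is not None:
            bins[bin_label][true_genome[frag]] += 1

    assigned = sum(sum(c.values()) for c in bins.values())
    # A fragment counts as correct if it shares its bin's majority genome.
    correct = sum(max(c.values()) for c in bins.values())
    purity = correct / assigned if assigned else 0.0
    assigned_fraction = assigned / len(true_genome) if true_genome else 0.0
    return purity, assigned_fraction


# Hypothetical ground truth: six fragments from two genomes.
truth = {"f1": "A", "f2": "A", "f3": "A", "f4": "B", "f5": "B", "f6": "B"}

# A cautious binner leaves uncertain fragments unassigned.
cautious = {"f1": "bin1", "f2": "bin1", "f3": None,
            "f4": "bin2", "f5": None, "f6": None}

# A greedy binner assigns everything and makes one mistake.
greedy = {"f1": "bin1", "f2": "bin1", "f3": "bin1",
          "f4": "bin1", "f5": "bin2", "f6": "bin2"}

print(binning_scores(truth, cautious))
print(binning_scores(truth, greedy))
```

In this made-up example the cautious binner is perfectly pure but bins only half the fragments, while the greedy binner bins all of them at a purity of five in six.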

In profiling, software tools fell into two groups: some recovered the relative abundance of bacteria in the sample more accurately, while others were better at detecting organisms, even those present in very low quantities. However, the latter tools identified the wrong organism more often.
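A rough sketch of how these two profiling goals can pull apart is shown below. The three scores (detection recall, detection precision and summed absolute abundance error) are common generic choices used here as an assumption; the article does not say which measures CAMI applied, and the taxa and numbers are invented.

```python
def profile_scores(true_abundance, predicted_abundance):
    """Return (recall, precision, abundance_error) for a toy profile.

    Both arguments map an organism name to its relative abundance (0..1).
    abundance_error is the sum of absolute differences over all organisms.
    """
    true_taxa = {t for t, a in true_abundance.items() if a > 0}
    pred_taxa = {t for t, a in predicted_abundance.items() if a > 0}
    hits = true_taxa & pred_taxa

    recall = len(hits) / len(true_taxa) if true_taxa else 0.0
    precision = len(hits) / len(pred_taxa) if pred_taxa else 0.0
    abundance_error = sum(
        abs(true_abundance.get(t, 0.0) - predicted_abundance.get(t, 0.0))
        for t in true_taxa | pred_taxa
    )
    return recall, precision, abundance_error


# Hypothetical sample: organism C is present at a very low level.
truth = {"A": 0.60, "B": 0.39, "C": 0.01}

# Tracks abundances closely but misses the rare organism.
conservative = {"A": 0.60, "B": 0.40}

# Finds the rare organism but also reports D, which is not there.
sensitive = {"A": 0.50, "B": 0.30, "C": 0.10, "D": 0.10}

for name, profile in (("conservative", conservative), ("sensitive", sensitive)):
    recall, precision, error = profile_scores(truth, profile)
    print(f"{name}: recall={recall:.2f} precision={precision:.2f} error={error:.2f}")
```

Here the conservative profile has the smaller abundance error but misses organism C, whereas the sensitive profile detects every true organism at the cost of one false report and a larger abundance error.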

Going forward, Pop said the CAMI group will continue to run new challenges with different data sets and new evaluations aimed at more specific aspects of software performance. Pop is excited to see scientists use the benchmarks to address research questions in the laboratory and the clinic.

"The field of metagenomics needs standards to ensure that results are correct, well validated and follow best practices," Pop said. "For instance, if a doctor is going to stage an intervention based on results from metagenomic software, it's essential that those results be correct. Our work provides a roadmap for choosing appropriate software."

More information: Alexander Sczyrba et al, Critical Assessment of Metagenome Interpretation − a benchmark of computational metagenomics software, Nature Methods (2017). DOI: 10.1101/099127

Journal information: Nature Methods

Provided by University of Maryland

Citation: International competition benchmarks metagenomics software (2017, October 2) retrieved 11 July 2024 from https://phys.org/news/2017-10-international-competition-benchmarks-metagenomics-software.html

