(PhysOrg.com) -- A handful of muck or a bucket of water can teem with millions of microorganisms, a few of which could be the next big thing when it comes to learning how to create biofuels or understanding the planet's carbon cycle.
This search for the movers and shakers of the microbial world is getting easier thanks to a database of genetic fingerprints, maintained by Lawrence Berkeley National Laboratory (Berkeley Lab) scientists, that surpassed one million entries earlier this year.
The database, called Greengenes, is one of the world's largest collections of high-quality DNA sequences of 16S ribosomal RNA genes. These genes, which encode part of the cell's protein-making machinery, are found in all microbes, and in general each species has a unique variation. They're genetic IDs, the one thing that can finger a specific microbe in a crowded lineup, if you know which 16S rRNA belongs to which microbe.
That's where Greengenes comes in. Researchers from around the world can access the database online and enter 16S rRNA sequences extracted from samples of soil, water, and even intestinal bacteria. A match with a sequence in Greengenes is a giveaway that a specific microbe is in the sample. If there's not a match, perhaps a new species has been discovered.
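For readers curious how such a lookup might work in principle, here is a minimal Python sketch. It is not Greengenes' actual software: the function names, the k-mer comparison, the 0.9 cutoff, and the toy reference sequences are all assumptions made for illustration. It scores a query sequence against each reference by the fraction of short subsequences (k-mers) they share, and reports no match when the best score falls below the cutoff.

```python
def kmers(seq, k=8):
    """Return the set of all overlapping k-length substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(query, references, k=8, threshold=0.9):
    """Find the reference most similar to the query.

    Returns (name, score); name is None when no reference reaches
    the threshold, hinting at a possibly unknown organism.
    The threshold is an arbitrary illustrative cutoff.
    """
    query_kmers = kmers(query, k)
    best_name, best_score = None, 0.0
    for name, ref_seq in references.items():
        score = jaccard(query_kmers, kmers(ref_seq, k))
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Toy reference set (made-up sequences, far shorter than real 16S genes)
references = {
    "microbe_A": "ACGTACGTTAGCCGATAGGCTA",
    "microbe_B": "TTGACCGGTAACCGGTTAAGGC",
}
print(best_match("ACGTACGTTAGCCGATAGGCTA", references, k=4))
print(best_match("GGGGGGGGGGGGGGGGGGGG", references, k=4))
```

Real tools typically align sequences before comparing them; the k-mer shortcut is used here only because it keeps the sketch short.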
In this way, Greengenes is fast becoming a go-to resource for scientists seeking to better understand what microbes do, their diversity, and what we can learn from them. The database launched in 2002 and now gets about 100 citations per year in scientific papers.
"Our goal is to develop the highest quality reference set so scientists can use it to better understand life at the microscopic scale. We want to cover as much microbial diversity on Earth as possible," says Todd DeSantis, a scientist in Berkeley Lab's Earth Sciences Division who led the development of the database under the auspices of Gary Andersen's lab.
In one of the database's many hits, Stanford University scientists used it to discover a microorganism in San Francisco Bay sediments that plays a role in the carbon and nitrogen cycles. The scientists could see the ammonia-oxidizing archaeon under the microscope, but they couldn't grow it in the lab. They extracted its DNA, sequenced it, and compared it to known strains in Greengenes. It was unique, and a new organism was named: Candidatus Nitrosoarchaeum limnia SFB1.
A Cornell University-led team used Greengenes to identify microbes that efficiently convert industrial wastewater into methane. Their work could help scientists engineer microbial communities that are optimized to digest wastewater and emit methane for use as an energy source.
Elsewhere, a team from the University of Milan used the database to analyze bacterial DNA from stains on the pages of Leonardo da Vinci's multi-volume Codex Atlanticus. They found matches to bacteria previously isolated from cleanrooms and human skin, which led the team to recommend new ways to protect texts from deterioration.
And a Danish team used the database to improve the treatment of necrotizing enterocolitis, a disease marked by inappropriate bacteria colonizing an infant's intestines.
Expect more uses from Greengenes as it continues to grow. When scientists find a 16S rRNA gene in the course of their research, they submit its sequence to one of many gene databanks. Greengenes scours these databanks for new entries. When it finds one, it uses a computer program to compare the sequence to other 16S rRNA genes and to ensure its quality. Only the best and most complete sequences are added.
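The screening step described above can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the program Greengenes runs: the function names, the 1,200-base minimum length, and the 1 percent limit on ambiguous bases are hypothetical stand-ins for "best and most complete."

```python
def passes_quality(seq, min_length=1200, max_ambiguous_fraction=0.01):
    """Return True if a candidate sequence looks complete and clean.

    Both thresholds are hypothetical values chosen for illustration.
    """
    seq = seq.upper()
    # Reject empty or incomplete sequences
    if not seq or len(seq) < min_length:
        return False
    # Count bases that are not an unambiguous A, C, G, or T
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return ambiguous / len(seq) <= max_ambiguous_fraction


def select_references(candidates, **thresholds):
    """Keep only the candidates that pass the quality screen."""
    return {
        name: seq
        for name, seq in candidates.items()
        if passes_quality(seq, **thresholds)
    }


# Made-up candidates: one acceptable, one too short, one too noisy
candidates = {
    "good": "ACGT" * 300,
    "short": "ACGT" * 10,
    "noisy": "ACGN" * 300,
}
print(sorted(select_references(candidates)))
```

In this toy run only the entry named "good" survives; the other two fail on length and on ambiguous bases, respectively.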
"There are tens of millions of 16S-like sequences in public databases, but we only want the highest quality sequences to use as references," says DeSantis.
More information: greengenes.lbl.gov/