DNA sequences need quality time too - guidelines for quality control published

September 5, 2012
This is the cover for the latest MycoKeys issue. Credit: Pensoft Publishers

Like all sources of information, DNA sequences come in various degrees of quality and reliability. To identify, proof, and discard compromised molecular data has thus become a critical component of the scientific endeavor - one that everyone generating sequence data is assumed to carry out before using the sequences for research purposes.

"Many researchers find sequence quality control difficult, though", says Dr. Henrik Nilsson of the University of Gothenburg, lead author of a new article on sequence reliability published in the Open Access journal MycoKeys. "There just isn't any straightforward document to put in their hands to give them a flying start. As a result, scientists differ in the degree to which they are aware of the need to exercise sequence quality control and in what measures they take." Previous studies have highlighted several shortcomings of publicly available DNA sequences: more than ten percent of the fungal DNA sequences may be misidentified at the species level, for example.

"A second complication", adds co-author Prof. Urmas Kõljalg of the University of Tartu, "is that the software available for sequence quality management tends to be very complex and resource intensive. It borders on the unfair to expect everyone to have access to, and to master, such computer environments. Fortunately, a whole lot can be done towards quality control of DNA sequences using just manual means and a web browser. The current MycoKeys paper describes these means to help those who do not have a strong background in computer science."

The article, "Five simple guidelines for establishing basic authenticity and reliability of newly generated fungal ITS sequences", compiles principles and observations to assist the reader in the quality management of newly generated DNA sequences. Although focusing on fungi, the guidelines are general and apply to most groups of organisms and genes. The guidelines target traditional DNA sequencing and are broadly applicable to datasets used in systematics, taxonomy, and ecology.

Co-author Dr. Martin Hartmann of the Swiss Federal Research Institute WSL concludes, "We hope that our guidelines will assist the readers in sharpening their datasets so that, eventually, the trend of increasing noise in the public sequence databases can be arrested. Molecular data offer so much promise that we simply cannot afford to lose accuracy to bias and artifacts."

More information: Nilsson RH, Tedersoo L, Abarenkov K, Ryberg M, Kristiansson E, Hartmann M, Schoch CL, Nylander JAA, Bergsten J, Porter TM, Jumpponen A, Vaishampayan P, Ovaskainen O, Hallenberg N, Bengtsson-Palme J, Eriksson KM, Larsson K-H, Larsson E, Kõljalg U (2012) Five simple guidelines for establishing basic authenticity and reliability of newly generated fungal ITS sequences. MycoKeys 4: 37-62. doi: 10.3897/mycokeys.4.3606
