New software automates and improves phylogenomics from next-generation sequencing data
Reconstructing phylogenetic trees from next-generation sequencing data with traditional methods requires a time-consuming combination of bioinformatic procedures, including genome assembly, gene prediction, orthology identification and multiple alignment. As a consequence, scientists have more recently relied on a simpler method in which short sequence reads from each species are aligned directly to a single reference genome sequence.
In the advance online edition of Molecular Biology and Evolution, Bertels et al. not only show that this simpler method can lead to significant errors and biases in phylogeny reconstruction, but have also developed a new online tool called REALPHY (reference sequence alignment based phylogeny builder) that automatically reconstructs evolutionary trees from next-generation sequencing data in a way that avoids these errors and biases. Applying the new method to several collections of bacterial genomes, the authors show that it is at least as accurate as, and often more accurate than, traditional methods.
"We believe REALPHY will make it easy for any researcher to obtain accurate phylogenies from next-generation sequencing data," said corresponding author Frederic Bertels of the University of Basel, Switzerland.
The software is simple enough for biologists without much bioinformatics expertise to use. REALPHY is available through a webserver, allowing for the fast and automated generation of multiple sequence alignments from a variety of genome sequence data formats (e.g., Illumina sequences, draft genomes, fully sequenced genomes), and the automated reconstruction of phylogenies from these alignments.