New DNA analysis provides first accurate tuberculosis genome

Developing Bact-Builder. a Pipeline overview. Bact-Builder takes raw fast5 sequencing data, files, assembles, generates a consensus, and polishes bacterial genomes. b Heatmap comparison of genome sizes of four de novo long read assemblers from laboratory stocks of H37Rv sequenced in triplicate (H37Rv.1-3). The sequence coverage sampled for each analysis is shown in each row on the Y axis. Boxes marked by an X indicate that the assemblies did not pass the Trycycler stage because they could not be reconciled with the other assemblies. * Indicates that 3 out of 4 assemblers could not be reconciled, necessitating that Trycycler was run with only 1 assembler. c Three replicates of laboratory stocks of H37Rv (H37Rv.1-3), showed variability in size depending on assembler used, and consistent sizes when Trycycler was followed by polishing (Bact-Builder output). Dotted line indicates the size of the established H37Rv reference. Data are plotted as means ± SD. d Heatmap of hierarchical clustering of the distance using Euclidean average linkage clustering of differences between all assemblies for H37Rv.1, the Bact-Builder output and the published reference (H37Rv ref) determined by DNAdiff. e Anvi’o pangenome comparing gene clusters in the reference (H37Rv ref) and H37Rv.1 individual assemblies, Trycycler output and the Bact-Builder output. Credit: Nature Communications (2022). DOI: 10.1038/s41467-022-34853-x

Researchers have developed a novel genome assembly tool that could spur the development of new treatments for tuberculosis and other bacterial infections.

The new tool, which has created an improved map of one tuberculosis strain, should do likewise for other strains and other types of bacteria, according to researchers whose findings appeared in Nature Communications.

Mycobacterium tuberculosis, the bacterium responsible for tuberculosis, infects about a quarter of the world's population and killed 1.6 million people in 2021, according to the World Health Organization. Current medical interventions are limited to a century-old vaccine that reduces infection risk by 20 percent and four to six months of strong antibiotics that sometimes prove ineffective.

"The key to beating this disease is to understand it, and the key to understanding it lies in its DNA," said David Alland, the senior author of the study who is chief of the Division of Infectious Diseases at Rutgers New Jersey Medical School and director of the school's Public Health Research Institute. "We hope our new pipeline provides researchers around the world with the information they need to create faster, more effective treatments and, ideally, a fully effective vaccine."

Scientists first sequenced the genome of one tuberculosis strain—H37Rv—in 1998, but they never could generate the sort of complete and accurate sequence that would maximize their chances of eradicating the disease—until now.

The new pipeline, dubbed Bact-Builder, combines common open-source genome assembly programs into a novel and easy-to-use tool which is freely available on GitHub.

Scientists today typically sequence new bacterial genomes by cutting large pieces of DNA into small, quick-to-scan fragments and then using a reference sequence such as H37Rv to align all the resulting pieces of data properly. However, assembling genomes without a reference, as Bact-Builder does with data from MinION sequencers, allows researchers to identify genes present in clinical strains that may not be present in the reference.

The tuberculosis sequence created by Bact-Builder contains approximately 6,400 more pieces of information (base pairs) than the old reference and, more importantly, identifies new genes and gene fragments missing in the old reference.

"Just publishing a fully accurate genome for the H37Rv reference strain, which is used in hundreds of studies a year, should significantly help research," Alland said.

Having an easy way to sequence all strains accurately is even more important, Alland said, "because strain comparison should answer many vital questions such as why some strains are more contagious than others. Why do some strains cause more serious disease? Why are some strains more difficult to cure? The answers to all these questions, which could help us devise better treatments and vaccines, are in the [genome], but you need an accurate way to find them."

More information: Poonam Chitale et al, A comprehensive update to the Mycobacterium tuberculosis H37Rv reference genome, Nature Communications (2022). DOI: 10.1038/s41467-022-34853-x


Journal information: Nature Communications

Provided by Rutgers University

Citation: New DNA analysis provides first accurate tuberculosis genome (2022, December 16) retrieved 1 March 2024 from
