New analytical tool improves genetic analysis and research accuracy

a, Schematic of the SAHMI pipeline. SAHMI identifies taxa that are truly present in tissues using a k-mer correlation test and identifies false positives and contaminants by comparing taxa distributions to an extensive negative control reference. b, Scatter plot showing the total number of sequencing reads and species detected in each study. Blue, experimentally introduced pathogen; red, natural infection involving a human tissue. See Methods for definitions of the microorganisms/viruses. c, Box plots showing significantly increased reads assigned to bacteria when the human genome is not included as a reference during taxonomic classification. Box plots show the median (line), 25th and 75th percentiles (box) and 1.5× interquartile range (whiskers); the black dot is an outlier. Two-sided t-tests; ****P < 2 × 10−16 (native R statistical test limit); ***P = 0.005; ns, not significant. Left, n = 23; middle, n = 38; right, n = 147. d, Histogram of the number of unique k-mers per sample assigned to the known pathogens and to all detected species in the benchmark studies. e, k-mer correlation test results for the truly present Salmonella enterica (S. enterica) in ref. 8 (top panels) and Fusarium venenatum (F. venenatum), an example false positive or contaminant from the same samples (bottom panels). The left three panels exhibit correlations across samples, and the right-most shows the k-mer test across barcodes. The labels at the top indicate the Spearman correlation value. See also Supplementary Fig. 1 for correlations for all truly present taxa in each benchmark dataset. f, Scatter plot of the k-mer correlation tests for species in the benchmark studies. Each point represents an individual species. Correlations are run across samples within a study. The x axis represents the Spearman correlation values between the number of k-mers versus the number of unique k-mers. 
The y axis represents the correlation value between the number of k-mers versus the number of reads; colors represent correlation values between the number of reads versus the number of unique k-mers. Lines represent contour densities. g, Two-sided Fisher test result for taxa in all benchmark studies that passed or failed the k-mer correlation tests. h, Box plots showing the fraction of reads mapped per taxon with BLAST for taxa that passed or failed the k-mer correlation tests in a subset of the skin leprosy data. Box plots are as in b, two-sided Wilcoxon rank sum testing. Failed, n = 162; Passed, n = 64. i, Box plots showing the distribution of BLAST mapping scores for taxa that failed or passed the k-mer correlation tests for a subset of the skin leprosy data. Box plots are as in b, two-sided Wilcoxon rank sum testing. Failed, n = 162; Passed, n = 64. j, Similar to e but for taxa detected in cell-line experiments. Credit: Nature Computational Science (2023). DOI: 10.1038/s43588-023-00507-1

Rutgers researchers have developed an analytical tool for spotting and omitting stray DNA and RNA that contaminate genetic analyses of single-celled organisms.

Their work, which appears in Nature Computational Science, also may help laboratories avoid mismatching sequenced gene fragments from different organisms in the same sample.

The tool, dubbed Single-cell Analysis of Host-Microbiome Interactions, or SAHMI, can improve the accuracy of medical research, particularly research into the microbiome's effect on health, and may eventually drive patient care that hinges upon genetic analyses of samples.

"Sample contamination happens frequently because extraneous genetic material is everywhere: flecking off patient fingers, floating through the air, lurking inside the laboratory's reagents," said Bassel Ghaddar, a dual doctoral degree candidate at Rutgers Robert Wood Johnson Medical School and lead author of the study.

"There's also a challenge arising from the algorithms we use to understand where sequenced gene segments come from," Ghaddar added. "They need to figure out whether a bit of DNA or RNA belongs to the patient or a bacterium in the microbiome or an invading virus or something else. And these algorithms can make a lot of mistakes."

After developing SAHMI, the creators made sure it worked by testing it on various datasets containing samples of human tissues with known microbial infections. They found SAHMI successfully identified and quantified the known pathogens in all the samples while filtering out contaminants and false positives.
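The figure caption describes the k-mer correlation test as three Spearman correlations computed across samples for each taxon: k-mers versus unique k-mers, k-mers versus reads, and reads versus unique k-mers. One plausible reading is that a microbe truly present in a tissue yields more distinct k-mers as more reads are assigned to it, while a contaminant or misclassified sequence does not. The Python sketch below illustrates that idea only. It is not SAHMI's code; the function names, the 0.5 cutoff and the requirement that all three correlations pass are assumptions made for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / (sx * sy)


def kmer_correlations(reads, kmers, unique_kmers):
    """Compute the three per-taxon correlations named in the figure caption.

    Each argument holds one value per sample for a single taxon.
    """
    return {
        "kmers_vs_unique_kmers": spearman(kmers, unique_kmers),
        "kmers_vs_reads": spearman(kmers, reads),
        "reads_vs_unique_kmers": spearman(reads, unique_kmers),
    }


def passes_kmer_test(correlations, min_rho=0.5):
    """Hypothetical decision rule: every correlation must reach min_rho."""
    return all(rho >= min_rho for rho in correlations.values())


# Made-up counts for four samples, for illustration only.
reads = [10, 40, 90, 200]
kmers = [300, 1100, 2600, 5900]
growing_unique = [250, 800, 1700, 3500]  # unique k-mers rise with reads
flat_unique = [30, 28, 31, 29]           # unique k-mers stay flat

print(passes_kmer_test(kmer_correlations(reads, kmers, growing_unique)))
print(passes_kmer_test(kmer_correlations(reads, kmers, flat_unique)))
```

With these invented counts, the taxon whose unique k-mers grow alongside its reads passes and the flat one fails. According to the caption, taxa are also compared against a negative control reference, a step this sketch leaves out.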

The testing also showed that SAHMI could be used to identify microbe-associated cells and to study the spatial distribution of microbes in tissues.

The software's ability to increase result accuracy may improve the study of various tissues and diseases. Ghaddar said it would be particularly valuable in tissues that typically harbor a large number of unknown microorganisms.

Such tissue types naturally include those that interact with the gut, skin, nose or lung microbiomes. They also include many tissue types once thought to be free of microbes, such as the pancreas and even many cancers.

With that in mind, the creators of SAHMI said it may be used to identify the microbes associated with specific diseases or to track changes in the microbiome during disease progression. It also could be used to study the effects of drugs or other interventions on the microbiome and the impact of the microbiome's initial composition on susceptibility to various diseases.

The Rutgers team has already used SAHMI to examine the microbiome of pancreatic tumors and identify particular microorganisms associated with inflammation and poor survival at single-cell resolution. The researchers said they believe microorganisms may be new targets for earlier diagnosis or treatment of pancreatic cancer, the fourth leading cause of cancer death for both men and women in the United States.

"The results this technique produced in our study of pancreatic cancer provided unexpected and important new insight into tumor development while also suggesting new ways to attack tumors," said Subhajyoti De, a principal investigator at Rutgers Cancer Institute and senior author of the study. "We think it could produce similar levels of insight in many other fields of study and ultimately in normal patient care, which is why we're making it freely available via GitHub."

More information: Bassel Ghaddar et al, Denoising sparse microbial signals from single-cell sequencing of mammalian host tissues, Nature Computational Science (2023). DOI: 10.1038/s43588-023-00507-1

Provided by Rutgers University

Citation: New analytical tool improves genetic analysis and research accuracy (2023, October 9) retrieved 30 April 2024 from https://phys.org/news/2023-10-analytical-tool-genetic-analysis-accuracy.html