What's a knot—and what's not—in genomic mapping
While DNA sequencing provides precise, nucleotide-by-nucleotide genomic information, genome mapping offers a bigger-picture view of sequenced DNA that can reveal valuable structural information. Like mapping roads to depict a city's layout without needing to detail each home or business, genome mapping can be a powerful tool for understanding variations in large pieces of rearranged or altered DNA.
More technically, genome mapping is used to obtain large-scale genomic information with a resolution of around 2,000 base pairs (bp), as opposed to the single-base resolution of sequencing. Genome mapping complements DNA sequencing, offering insight into huge, intact molecules between 150,000 and 1 million bp in length. Obtaining measurements of such large segments is not without its challenges, but new research into the physics of nanochannel mapping, published this week in the journal Biomicrofluidics, may help overcome a (literal) knot in the process and advance genome mapping technology.
A team of researchers from the University of Minnesota partnered with BioNano Genomics, a company commercializing genome mapping in nanochannels, to understand the basic physics that underlies the mapping and to use that understanding to improve the technology. BioNano Genomics maps genomes by encoding DNA with sequence-specific, fluorescent labels before injecting it into nanochannels that cause the molecule to stretch out. The structural mapping information is read from the stretched DNA.
DNA knots, however, would put a kink in this method, as the molecular tangle could be read incorrectly as a structural variation in the genome sequence. To better understand these nanoknots, the group used computer simulations to model nanochannel configurations of DNA and compared the predictions to measurement-based characterizations.
"We looked at the probability that the DNA would form a knot inside the channel and predicted the size of the knots," said Kevin Dorfman, a member of the research team and lead author of the work. "This is important in mapping because knots could be incorrectly characterized as changes in DNA sequence structure when, in fact, they are just rearrangements of the DNA within the channel."
This line of research posed several challenges. The probability of forming knots is very low, and the molecules used in genome mapping are very large, so the team had to build a computational pipeline capable of simulating the system both at the resolution of the knots and at the scale of the whole DNA segment.
"Previous work on DNA knotting in these types of nanochannels looked at molecules that were almost an order of magnitude smaller than the ones in our study," Dorfman said. Another challenge the team faced was handling the terabytes of data from the millions of DNA chains that were generated to get meaningful statistics.
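To see why rare knots call for so many simulated chains, consider a rough back-of-the-envelope sketch. It assumes each chain is an independent trial that is either knotted or not, so the knotting probability is estimated as a simple binomial proportion. The function name and the probabilities used here are hypothetical illustrations, not values or methods from the study.

```python
import math


def chains_needed(knot_probability: float, relative_error: float) -> int:
    """Estimate how many independent chains are needed so that the standard
    error of the measured knotting probability is the given fraction of the
    probability itself.

    Assumes a binomial model: the standard error of the estimate is
    sqrt(p * (1 - p) / N), so the relative error is sqrt((1 - p) / (p * N)).
    Solving for N gives N = (1 - p) / (p * relative_error**2).
    """
    if not 0.0 < knot_probability < 1.0:
        raise ValueError("knot_probability must be strictly between 0 and 1")
    if relative_error <= 0.0:
        raise ValueError("relative_error must be positive")
    n = (1.0 - knot_probability) / (knot_probability * relative_error ** 2)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical knotting probabilities, chosen only for illustration.
    for p in (1e-2, 1e-3, 1e-4):
        print(p, chains_needed(p, relative_error=0.1))
```

Under these assumptions, the required number of chains grows roughly in inverse proportion to the knotting probability, which is why a rare event quickly pushes the sample size into the millions.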
The goal of this research was to see whether or not the model's predictions of knotting were consistent with bright spots observed during experiments that could be knots on the DNA molecule.
"We found that experimental results are not consistent with equilibrium statistical mechanics, meaning that the knots in the experiments may not actually be knots—while the way the data were processed in the experiments suggests many potential knotting events, we cannot definitively identify these events as knots," Dorfman said.
To address the discrepancy between experiments and simulations, the group will have to return to experiments, collecting dynamic data from the movement of the knots.
"The dynamic information can give us very important insights about the structure of the DNA in the channel, and potentially allow us to tell if the knots are, indeed, knots," Dorfman said.
Because knot formation is very rare, this work requires acquiring huge data sets, screening them to locate possibly knotted DNA, and then analyzing those DNA molecules in detail.
The work did reveal, however, that these knots are not an intrinsic problem in genome mapping. If knot formation were frequent, it would make the processing of genome mapping data much more challenging. If the apparent knots in the experiments come from some other source, then they can be removed by changing other parts of the experimental protocol.