
Kadir Dede and Dr. Enno Ohlebusch at Ulm University in Germany have devised a method for constructing pan-genome subgraphs at different granularities without waiting hours or even days for software to process the entire genome. Scientists will now be able to create visualizations of pan-genomes on different scales much more rapidly.

The paper, "Dynamic construction of pan-genome subgraphs," was published in De Gruyter's open access journal Open Computer Science.

To analyze specific parts of a genome, scientists must be able to "see" the parts they are investigating, and this requires a large amount of processing power and time. The Computational Pan-Genomics Consortium encourages researchers to ensure that all information within a data structure is easily accessible through visualization support on different scales. However, a pan-genome graph can have thousands to millions of nodes, far too many for the human eye to take in at once.

In an experiment, Dede and Ohlebusch used 10 human genomes and computed a graph that contains part of the large repetitive central exon of the human MUC5AC gene. Formerly, researchers had to create an entire index structure of the genomes, which takes about 8.5 hours and requires 38.5 GB of memory. Using the method developed by Dede and Ohlebusch, the researcher simply has to compute two bit-vectors (on which the construction of the subgraph is based) and the subgraph

More information: Kadir Dede et al. Dynamic construction of pan-genome subgraphs, Open Computer Science (2020). DOI: 10.1515/comp-2020-0018

Provided by De Gruyter