Researchers from Skoltech and their colleagues have optimized data analysis for a common method of studying the 3-D structure of DNA in single cells of a Drosophila fly. The new approach allows the scientists to peek with greater confidence into individual cells to study the unique ways DNA is packaged there and to get closer to understanding the underlying mechanisms of this crucial process. The paper was published in the journal Nature Communications.

A roughly two-meter-long strand of DNA fits into the tiny nucleus of a human cell because chromatin, a complex of DNA and proteins, packages it into compact but very complex structures. To study the way DNA is packaged, researchers across the world have developed so-called chromosome conformation capture (3C) techniques, the most efficient of which is called Hi-C. Hi-C essentially catalogs all interacting fragments of a DNA strand via high-throughput sequencing.

Therein lies the problem, however: to work, Hi-C needs tens of micrograms of DNA, which means millions of cells, each with its unique spatial organization of chromatin, have to be averaged to get a snapshot that will inevitably miss some peculiarities of DNA packaging in individual cells. Much as an 'average' individual does not really exist, conventional Hi-C cannot tell you which of the multitude of interactions actually happen in the same cell. Furthermore, this snapshot will hardly be useful in unraveling the physical processes that led to a particular 3-D structure of chromatin.
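
The averaging problem can be made concrete with a toy calculation. The sketch below is purely illustrative and is not taken from the paper: the three loci, the binary contact maps, and the helper names `contact_map` and `average_maps` are all assumptions made for the example. It builds a hypothetical population in which half the cells have one contact and half have another, then shows that the averaged map reports both contacts even though no single cell contains both.

```python
# Toy illustration only: three loci (0, 1, 2) and binary per-cell contact maps.
# All names and numbers here are hypothetical, not the authors' pipeline.

def contact_map(pairs, n=3):
    """Build a symmetric n x n 0/1 contact matrix from a list of locus pairs."""
    m = [[0] * n for _ in range(n)]
    for i, j in pairs:
        m[i][j] = m[j][i] = 1
    return m

def average_maps(maps):
    """Element-wise mean of equally sized contact matrices (the 'bulk' view)."""
    n = len(maps[0])
    return [[sum(m[i][j] for m in maps) / len(maps) for j in range(n)]
            for i in range(n)]

# Hypothetical population: half the cells have contact 0-1, the other half 0-2.
cells = [contact_map([(0, 1)]), contact_map([(0, 2)])] * 2
bulk = average_maps(cells)

# Both contacts show up in the averaged map ...
both_in_bulk = bulk[0][1] > 0 and bulk[0][2] > 0
# ... yet we can count how many individual cells actually carry both at once.
cells_with_both = sum(1 for m in cells if m[0][1] and m[0][2])

print("bulk frequency of contact 0-1:", bulk[0][1])
print("bulk frequency of contact 0-2:", bulk[0][2])
print("cells containing both contacts:", cells_with_both)
```

In this made-up population the bulk map alone cannot distinguish "every cell has both contacts half the time" from "each cell has exactly one of them", which is the ambiguity single-cell measurements are meant to resolve.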

"We see certain structures, such as so-called Topologically Associating Domains, or TADs, in averaged contact maps, but we do not know whether they are artifacts of averaging or indeed exist in individual cells. Moreover, we know that cells even in one tissue may be quite diverse in terms of gene expression—so a natural question arises whether this diversity also exists on the structural level," says Mikhail Gelfand, Skoltech Vice President for Biomedical Research and a coauthor of the new paper.

To overcome this hurdle and make the Hi-C process more suitable for single cells, researchers from several institutes advanced a technique called single-cell Hi-C. The Skoltech team, led by Gelfand and Ekaterina Khrameeva, an assistant professor at the Skoltech Center of Life Sciences, took on the challenge of optimizing data processing for single-cell Hi-C and uncovering the fundamental properties of Drosophila cells.

Their colleagues from the Institute of Gene Biology RAS and Lomonosov Moscow State University, in collaboration with researchers from the French-Russian Interdisciplinary Scientific Center J.-V. Poncelet, optimized the single-nucleus Hi-C (snHi-C) procedure to make it suitable for experiments with Drosophila cells.

For the technique to work, the teams had to start with the same Hi-C steps of chemically 'freezing' the chromatin in place, strategically cutting the DNA and reassembling it so that fragments that were spatially close end up stitched together. But then, instead of using the DNA in bulk, they amplified the minuscule amounts of DNA from a single cell in each well using a polymerase from the phi29 bacteriophage. This phi29 polymerase is widely used in DNA amplification methods thanks in part to its ability to generate a lot of DNA from the tiniest of templates and to make significantly fewer errors than other commonly used polymerases.

However, it turned out that the handy DNA polymerase, while less error-prone, can still make random 'hops' between DNA molecules, creating artificial 'links' that Hi-C algorithms cannot distinguish from real interactions. So the researchers had to come up with an authenticity test, weeding out the random hops from real traces of interacting fragments.

They used their new technique on Drosophila cells to try to find out whether the fundamental ways of chromatin folding are the same across organisms. Earlier studies in mammalian cells pointed to the existence of TADs in average contact maps from Hi-C analysis, but not in individual cells. However, in Drosophila, single-cell data show that there are TADs in each particular cell. More research is needed to elucidate the biological mechanism that forms these stable domains, but the researchers suggest two types of models for these TADs. One implies that Drosophila chromatin is 'sticky' in a particular way, with different regions having different affinity to form contacts. The other, the so-called loop extrusion mechanism, posits that large protein complexes create loops in the strand, bringing distant regions close together and creating a larger-scale structure.

"Perhaps one of the most interesting questions to ask is whether chromatin folding rules are similar between different species of living organisms. Having single-cell Hi-C for Drosophila, we noticed that the genome of this insect is folded into domains, similar to the domains observed in single mammalian cells. However, these structures are much more ordered than in mammals," Aleksandra Galitsyna, Ph.D. student at Skoltech and one of the paper's first co-authors, notes.

"We will continue studying chromatin architecture and dig into the mechanisms of loop and TAD formation. There are lots of unanswered questions in this area. We already know that these mechanisms might differ between some organisms, but what is the full picture of chromatin folding evolution? If we want to understand it at a sufficient scale, we would need to bridge gaps between well-studied organisms by resolving chromatin structure in the weird ones. To do that, we are already working on sponges, yeasts, and amoeba," Ekaterina Khrameeva says.

She adds that the team is also interested in how changes in chromatin organization might be associated with disease, organism development and aging. "Assuming that architecture is tightly linked to gene expression, answering these questions might unravel the regulatory prerequisites of human development, aging, and disease," Khrameeva notes.

More information: Sergey V. Ulianov et al. Order and stochasticity in the folding of individual Drosophila genomes, Nature Communications (2021). DOI: 10.1038/s41467-020-20292-z

Journal information: Nature Communications