A model to decipher the complexity of gene regulation
How, where and when genes are expressed determine individual phenotypes. If gene expression is controlled by many regulatory elements, what, ultimately, controls them? And how does genetic variation affect them? The SysGenetiX project, led by the University of Geneva (UNIGE) in collaboration with the University of Lausanne (UNIL), Switzerland, sought to investigate these regulatory elements, as well as the manifold interactions between them and with genes, with the ultimate goal of understanding the mechanisms that render some people more predisposed to manifesting particular diseases than others.
By studying chromatin modifications (i.e. how the genome is "packaged") in the cells of about 300 individuals, scientists from Geneva and Lausanne not only identified the structure of these regulatory elements, they were also able to model how their interactions throughout the whole genome influence gene regulation and risk of disease. Their approach is now published in Science.
Emmanouil Dermitzakis, leader of the SysGenetiX project, is a specialist in the genetic variation of gene regulation. He explains the novel approach of this work: "Instead of only studying the levels of gene expression, a strategy that gives only a partial picture, we decided to focus on chromatin, which seems to be an intermediate molecular component of regulation."
Chromatin, a complex of DNA, RNA and proteins, packages the genome and plays important roles in protecting DNA during crucial phases of the cell cycle. Chromatin modifications mediate the effects of transcription factors, and eventually regulate gene expression. By profiling chromatin, the scientists were able to capture the activity levels of most regulatory elements.
"We had tested our approach on more focused settings in past studies," says Olivier Delaneau, a researcher in Prof. Dermitzakis' lab and first author of this work. "This time, we wanted to study chromatin profiles of large samples to be able to understand, at population level, how genetic variation influences chromatin variability, which in turn transmits that variability to gene expression. All these data could be used to build robust models of activation mechanisms and regulatory networks, and to understand what affects whether genes are expressed."
The building blocks of our genome
The analysis of the chromatin profiles allowed the scientists to make an important discovery. "Regulatory activity appears to be organized in fully independent blocks, with series of regulatory elements in the same genomic region being all high or all low at the same time, as if regulatory elements were stuck together in genomic Lego blocks," says Alexandre Reymond, professor at the Center for Integrative Genomics, UNIL Faculty of Biology and Medicine, who co-led this work. Other geneticists had already pinpointed rather large structures, called "topologically associating domains," or TADs, that play a role in gene regulation. However, the "blocks" identified here, named cis-regulatory domains (CRDs), are much smaller, enabling a much finer-resolution map of gene expression.
To understand their function, the scientists built specific models to measure how genetic variation affects these structures, which increase or decrease gene activity. By analysing several hundred samples, the scientists found genetic variants that not only increase or decrease gene expression, but also have the power to change the very structure of these blocks by, for instance, splitting one block into two fully separate structures. In doing so, these variants change the landscape of regulation, and therefore gene expression.
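The "Lego block" idea can be illustrated with a toy calculation. The sketch below is not the authors' method: it simply assumes that regulatory elements are listed in genomic order, each with one activity value per individual, and it joins neighbouring elements into the same block whenever their activities are strongly correlated across individuals. The function names, the 0.8 threshold and the data are all invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def call_blocks(activity, threshold=0.8):
    """Group neighbouring regulatory elements into blocks (hypothetical helper).

    activity: one list per regulatory element, ordered by genomic position,
              holding that element's activity in each individual.
    An element joins the current block if it correlates with the previous
    element at or above `threshold`; otherwise it starts a new block.
    """
    blocks = [[0]]
    for i in range(1, len(activity)):
        if pearson(activity[i - 1], activity[i]) >= threshold:
            blocks[-1].append(i)
        else:
            blocks.append([i])
    return blocks

# Invented data: 5 regulatory elements measured in 6 individuals.
activity = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.1, 2.1, 2.9, 4.2, 5.1, 5.8],
    [0.9, 1.8, 3.2, 3.9, 4.8, 6.1],
    [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
    [5.2, 0.9, 4.1, 2.2, 5.9, 3.1],
]
blocks = call_blocks(activity)
print(blocks)  # [[0, 1, 2], [3, 4]]
```

In this toy data the first three elements rise and fall together across individuals, as do the last two, so they fall into two separate blocks.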
Acting locally for a global impact
"DNA is not a two-dimensional structure in the cell nucleus; it needs to be understood in three (or more) dimensions," says Emmanouil Dermitzakis. "According to a traditional model of gene regulation, a gene enhancer must be located near the gene, in the same genomic region. By contrast, our model shows that regulatory elements could very well be on another chromosome. Because of the nuclear 3-D structure that brings regions together, a cross-talk of regions can take place in any of our 23 chromosomes, with 'trans-regulatory hubs' affecting genes anywhere."
The geneticists were able to create statistical models showing which genetic variant influences which block of chromatin, which in turn affects multiple genes across the genome. In addition, while mutations in genes are relatively easy to observe, mutations in regulatory elements, which are located in non-coding DNA, are harder to pin down. "Indeed, as we do not understand their 'grammar,' it is difficult to identify whether mutations will have an influence, positive or negative. By pooling them together, we were able to design a method to search for rare variants in non-coding regions," says Olivier Delaneau. "For the first time, we provide a framework of the burden of complex diseases in the non-coding DNA."
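Pooling rare variants can be pictured as a simple count per individual. The sketch below is an assumption about what such pooling might look like in its most basic form, not the method used in the study: for one regulatory block, it keeps only the variants whose alternative allele is rare in the sample and adds up how many of those rare alleles each individual carries. The function name, the frequency cut-off and the genotypes are invented.

```python
def rare_variant_burden(genotypes, max_freq=0.05):
    """Count rare alleles per individual within one region (hypothetical helper).

    genotypes: one list per individual, with one entry per variant giving the
               number of alternative alleles carried (0, 1 or 2).
    A variant counts as rare if its alternative-allele frequency in the
    sample is above 0 and at most `max_freq`.
    """
    n_ind = len(genotypes)
    n_var = len(genotypes[0])
    # Each individual carries two copies, hence 2 * n_ind alleles per variant.
    freqs = [sum(g[j] for g in genotypes) / (2 * n_ind) for j in range(n_var)]
    rare = [j for j, f in enumerate(freqs) if 0 < f <= max_freq]
    return [sum(g[j] for j in rare) for g in genotypes]

# Invented data: 10 individuals, 4 variants in one regulatory block.
# Variant 0 is common; variants 1 to 3 are each seen once.
genotypes = [
    [1, 0, 0, 0],
    [2, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
burden = rare_variant_burden(genotypes, max_freq=0.1)
print(burden)  # [0, 0, 2, 0, 0, 0, 0, 1, 0, 0]
```

A per-individual count like this could then be compared between people with and without a disease, which is the general idea behind a burden analysis.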
Building models to decipher complexity
The work, led by Prof. Dermitzakis' and Prof. Reymond's teams with a major contribution from Prof. Stylianos Antonarakis' team at the UNIGE Faculty of Medicine, constitutes a turning point in gene regulation analysis. By incorporating the complexity of the genome into a single model, the scientists provide a tree of correlations of all regulatory elements across the whole genome.
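A tree of correlations can be sketched with a small bottom-up clustering. The code below is a simplified stand-in, not the model built in the study: it starts with one node per regulatory element and repeatedly merges the two nodes whose activity profiles correlate best across individuals, storing in each new node the average profile of all the elements beneath it. The function names and data are invented, and profiles are assumed to vary between individuals (a constant profile has no defined correlation).

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def build_tree(profiles):
    """Merge the best-correlated pair of nodes until one root remains.

    profiles: dict mapping element name -> activity values, one per individual.
    Each node is a dict with 'leaves' (element names below it), 'profile'
    (mean activity of those elements) and 'children'.
    """
    nodes = [{"leaves": [name], "profile": list(values), "children": []}
             for name, values in profiles.items()]
    while len(nodes) > 1:
        a, b = max(combinations(nodes, 2),
                   key=lambda p: pearson(p[0]["profile"], p[1]["profile"]))
        na, nb = len(a["leaves"]), len(b["leaves"])
        merged = {
            "leaves": a["leaves"] + b["leaves"],
            # Weighting by leaf count keeps this the mean over all leaves.
            "profile": [(na * x + nb * y) / (na + nb)
                        for x, y in zip(a["profile"], b["profile"])],
            "children": [a, b],
        }
        nodes = [n for n in nodes if n is not a and n is not b] + [merged]
    return nodes[0]

# Invented data: 4 regulatory elements measured in 4 individuals.
profiles = {
    "e1": [1.0, 2.0, 3.0, 4.0],
    "e2": [1.2, 1.9, 3.1, 4.2],
    "e3": [4.0, 1.0, 3.0, 2.0],
    "e4": [4.1, 0.8, 3.2, 2.1],
}
root = build_tree(profiles)
groups = {frozenset(child["leaves"]) for child in root["children"]}
print(groups)
```

Here the root splits into two branches, one holding e1 and e2 and the other e3 and e4, and each node's stored profile offers a single summary of everything below it.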
"Every node of this tree can then be analysed to summarize the effects of that node, as well as the variability of all regulatory elements below that could be relevant to a certain phenotype," says Alexandre Reymond. This structure reduces the number of hypotheses, and opens up a whole new world in the study of the effect of genetic variation on genome function. Furthermore, modelling complexity to determine how specific genetic or environmental factors contribute to a person's risk of developing a disease, or to how that disease manifests, is exactly what "precision medicine" means.
"The more we disentangle the complexity, the easier it is to discover what we are looking for," the authors conclude.