Mathematics explains why Crispr-Cas9 sometimes cuts the wrong DNA
The discovery of the Cas9 protein has simplified gene editing, and may even make it possible to eliminate many hereditary diseases in the near future. Using Cas9, researchers have the ability to cut DNA in a cell to correct mutated genes, or paste new pieces of genetic material into the newly opened spot. Initially, the Crispr-Cas9 system seemed to be extremely accurate. However, it is now apparent that Cas9 sometimes also cuts other DNA sequences similar to the sequences it was programmed to target. Scientists at Delft University of Technology have developed a mathematical model that explains why Cas9 cuts some DNA sequences while leaving others alone.
The Crispr-Cas9 system is a defence mechanism that protects bacteria from viruses. If a virus enters a bacterium but does not take over the cell, the defence system cuts out some genetic material from the virus and stores it in the bacterium's own genome. The built-in viral DNA acts as a genetic memory. If the same virus later attacks the bacterium (or its descendants), the cell quickly recognises the attacker and can send out Cas9 proteins to track it down. Using an RNA copy of the stored viral sequence as a sort of "cheat sheet," the protein hunts for hostile DNA in the cell. If it finds a match, the Crispr-Cas9 system cuts the viral DNA, incapacitating the threat.
Scientists initially thought that Crispr-Cas9 only cleaves a piece of DNA if it exactly matches the cheat sheet of RNA that it carries. However, that assumption has now been proven wrong. The protein sometimes cuts DNA sequences that resemble the material it is looking for but differ from it in a few letters. According to researcher Martin Depken of Delft University of Technology, cutting such slightly differing sequences makes sense from an evolutionary point of view. "Viruses mutate constantly, and can therefore have a different genetic make-up than what Cas9 is looking for," he says. "By also cutting DNA sequences that are slightly different, the Crispr-Cas9 system can track the evolution of a virus and better protect the bacterium against its foes."
But in this case, what is good for bacteria is bad for humans. If we want to use Cas9 to edit genes in human DNA, it is imperative that no genes are cut other than the ones researchers target. Destroying other genetic material can have dire consequences.
Experiments have shown that Crispr-Cas9 is more likely to cut certain non-matching sequences than others. Scientists from the research group of Martin Depken, led by Ph.D. student Misha Klein, wondered what underlying physics determines this preference. Depken says it all comes down to the energy it costs to make base pairs that deviate from the RNA template.
"When Cas9 checks if a DNA sequence is a match, it starts at one end of the strand," Depken explains. "Then, it checks all of the letters of the strand in turn. For each match, Cas9 is rewarded with energy, while any mismatch costs energy. The more errors a DNA sequence contains, and the closer they are to the start of the sequence, the more likely it is that the protein will refrain from cutting. Instead, it will unbind from the DNA and continue its search for a piece of genetic material that better matches its RNA template."
According to Depken, the simple mathematical model developed by his group predicts existing data about Cas9's cutting behaviour surprisingly well. If an error is situated at the end of the sequence, the protein will likely have amassed enough energy to overcome that hurdle, which increases the probability of cutting. The model also explains why Cas9 refrains from cutting when it encounters a mismatch at the beginning of a sequence, or when two mismatches are close together.
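The energy bookkeeping described above can be illustrated with a toy calculation. The sketch below is not the Delft group's model, and none of its numbers come from their work: the energy values, the made-up guide sequence, and the function names `cut_probability` and `with_mismatches` are all assumptions chosen for illustration. It treats the letter-by-letter check as a random walk along the sequence, in which each match lowers the energy and each mismatch raises it. Assuming a constant rate for stepping forward and backward rates set by the energy differences, the chance of reaching the far end (and cutting) before falling off works out to 1 / (1 + sum of exp(F_k)), where F_k is the energy after k letters have been checked, measured relative to the unbound protein.

```python
import math

# Hypothetical energies in units of k_B*T. Illustrative values only,
# not fitted to any experiment.
MATCH_GAIN = 0.5      # energy released by each matching letter
MISMATCH_COST = 4.0   # energy paid for each mismatching letter
PAM_ENERGY = -2.0     # assumed energy of the initial binding step


def cut_probability(guide, target, match_gain=MATCH_GAIN,
                    mismatch_cost=MISMATCH_COST, pam_energy=PAM_ENERGY):
    """Probability that a bound protein cuts before it unbinds.

    Toy assumption: state k means k letters have been checked, the
    unbound state has energy 0, and reaching the last letter counts
    as cutting. For simplicity, guide and target are compared letter
    by letter in the same alphabet.
    """
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")

    energy = pam_energy                 # state 0: bound, nothing checked
    boltzmann_sum = math.exp(energy)
    # States 1 .. N-1; the final state N is the cut itself.
    for g, t in zip(guide[:-1], target[:-1]):
        energy += -match_gain if g == t else mismatch_cost
        boltzmann_sum += math.exp(energy)
    return 1.0 / (1.0 + boltzmann_sum)


def with_mismatches(seq, positions):
    """Return seq with the letters at the given 1-based positions changed."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    return "".join(swap[base] if i in positions else base
                   for i, base in enumerate(seq, start=1))


GUIDE = "GACGTTACCGGATCAAGCTT"  # made-up 20-letter sequence

if __name__ == "__main__":
    cases = {
        "perfect match": set(),
        "one mismatch near the start (position 2)": {2},
        "one mismatch near the end (position 18)": {18},
        "two mismatches close together (10, 11)": {10, 11},
        "two mismatches far apart (10, 18)": {10, 18},
    }
    for label, positions in cases.items():
        target = with_mismatches(GUIDE, positions)
        print(f"{label}: {cut_probability(GUIDE, target):.3f}")
```

With these assumed values, a mismatch near the start of the sequence lowers the cutting probability far more than the same mismatch near the end, and two neighbouring mismatches suppress cutting more than two mismatches placed far apart, which is the qualitative pattern described above. The absolute numbers carry no meaning, since every energy in the sketch is invented.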
When it comes to the probability that a DNA sequence will be cut, the physical properties of the Cas9 protein itself also play a role. Depken and his colleagues are now looking to incorporate this variable into their model. Ultimately, the model should lead to better predictions of the errors that Cas9 is likely to make. "Sometimes, there is a choice in the exact location to cut when fixing a gene, and our model will help determine which locations are the best to target," says Depken. The physical understanding provided by the model can also help efforts to avoid life-threatening mistakes while editing DNA.