Study tracks evolution of SARS-CoV-2 virus mutations
Since COVID-19 began its menacing march across Wuhan, China, in December 2019, and then across the world, the SARS-CoV-2 virus has adopted a "whatever works" strategy to ensure its replication and spread. But in a new study published in Evolutionary Bioinformatics, University of Illinois researchers and students show that the virus is honing tactics that may make it more successful and more stable.
A group of graduate students in a spring-semester Bioinformatics and Systems Biology class at Illinois tracked the mutation rate in the virus's proteome (the collection of proteins encoded by its genetic material) through time, starting with the first SARS-CoV-2 genome published in January 2020 and ending more than 15,300 genomes later in May.
The team found some regions still actively spinning off new mutations, indicating continuing adaptation to the host environment. But the mutation rate in other regions showed signs of slowing, coalescing around single versions of key proteins.
"That is bad news. The virus is changing and changing, but it is keeping the things that are most useful or interesting for itself," says Gustavo Caetano-Anolles, professor of bioinformatics in the Department of Crop Sciences at Illinois and senior author on the study.
Importantly, however, the stabilization of certain proteins could be good news for the treatment of COVID-19.
According to first author Tre Tomaszewski, a doctoral student in the School of Information Sciences at Illinois, "In vaccine development, for example, you need to know what the antibodies are attaching to. New mutations could change everything, including the way proteins are constructed, their shape. An antibody target could go from the surface of a protein to being folded inside of it, and you can't get to it anymore. Knowing which proteins and structures are sticking around will provide important insights for vaccines and other therapies."
The research team documented a general slowdown in the virus's mutation rate starting in April, after an initial period of rapid change. This included stabilization within the spike protein, those pokey appendages that give coronaviruses their crowned appearance.
Within the spike, the researchers found that an amino acid at site 614 was replaced with another (aspartic acid to glycine), a mutation that took over the entire virus population during March and April.
"The spike was a completely different protein at the very beginning than it is now. You can barely find that initial version now," Tomaszewski says.
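The takeover at site 614 described above can be illustrated with a short Python sketch that tallies, month by month, the share of spike sequences carrying glycine (G) at that position. This is only an illustration and not the study's actual pipeline: the function name, the input layout (pairs of collection date and aligned spike protein sequence), and the toy records are all assumptions made for the example, and the toy sequences are made-up placeholders rather than real GISAID data.

```python
from collections import defaultdict
from datetime import date

SITE = 614  # 1-based spike position discussed in the article


def monthly_variant_fraction(records, site=SITE, residue="G"):
    """Return {(year, month): fraction of sequences with `residue` at `site`}.

    `records` is an iterable of (collection_date, spike_sequence) pairs.
    This input layout is a hypothetical choice for the example.
    """
    counts = defaultdict(lambda: [0, 0])  # month -> [with residue, total]
    for collected, seq in records:
        if len(seq) < site:
            continue  # skip sequences too short to cover the site
        key = (collected.year, collected.month)
        counts[key][1] += 1
        if seq[site - 1] == residue:
            counts[key][0] += 1
    return {k: hits / total for k, (hits, total) in sorted(counts.items())}


def toy_spike(residue):
    """Build a placeholder 614-residue sequence ending in `residue` (not real data)."""
    return "A" * (SITE - 1) + residue


# Made-up toy records: D (aspartic acid) early, G (glycine) later.
toy_records = [
    (date(2020, 1, 10), toy_spike("D")),
    (date(2020, 1, 20), toy_spike("D")),
    (date(2020, 3, 5), toy_spike("D")),
    (date(2020, 3, 15), toy_spike("G")),
    (date(2020, 4, 2), toy_spike("G")),
    (date(2020, 4, 20), toy_spike("G")),
]

print(monthly_variant_fraction(toy_records))
# {(2020, 1): 0.0, (2020, 3): 0.5, (2020, 4): 1.0}
```

With real data, a fraction that climbs toward 1.0 and stays there would correspond to the kind of replacement the researchers describe for March and April.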
The spike protein, which is organized into two main domains, is responsible for attaching to human cells and helping inject the virus's genetic material, RNA, inside to be replicated. The 614 mutation breaks an important bond between distinct domains and protein subunits in the spike.
"For some reason, this must help the virus increase its spread and infectivity in entering the host. Or else the mutation wouldn't be kept," Caetano-Anolles says.
The 614 mutation was associated with increased viral loads and higher infectivity in a previous study, with no effect on disease severity. Yet, in another study, the mutation was linked with higher case fatality rates. Tomaszewski says that although its role in virulence needs confirmation, the mutation clearly mediates entry into host cells and therefore is critical for understanding virus transmission and spread.
Remarkably, sites within two other notable proteins also became more stable starting in April: the NSP12 polymerase protein, which duplicates RNA, and the NSP13 helicase protein, which unwinds RNA strands during replication.
"All three mutations seem to be coordinated with each other," Caetano-Anolles says. "They are in different molecules, but they are following the same evolutionary process."
The researchers also noted regions of the virus proteome becoming more variable through time, which they say may give us an indication of what to expect next with COVID-19. Specifically, they found increasing mutations in the nucleocapsid protein, which packages the virus's RNA after the virus enters a host cell, and the 3a viroporin protein, which creates pores in host cells to facilitate viral release, replication, and virulence.
The research team says these are regions to watch, because increasing non-random variability in these proteins suggests the virus is actively seeking ways to improve its spread. Caetano-Anolles explains that these two proteins interfere with how our bodies combat the virus. They are the main blockers of the beta-interferon pathway, a key part of our antiviral defenses. Their mutation could explain the uncontrolled immune responses responsible for so many COVID-19 deaths.
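One common way to tell whether a protein site is settling on a single version or becoming more variable is to compare the Shannon entropy of the residues seen at that site in an early and a late batch of sequences. The Python sketch below shows the idea under stated assumptions: the article does not say which variability measure the study used, so entropy is a stand-in here, and the function names and the two-residue toy sequences are hypothetical placeholders rather than real data.

```python
import math
from collections import Counter


def column_entropy(sequences, site):
    """Shannon entropy (in bits) of the residues at 1-based `site`.

    0.0 means every sequence agrees at the site; higher values mean
    more variability.
    """
    residues = [s[site - 1] for s in sequences if len(s) >= site]
    if not residues:
        return 0.0
    total = len(residues)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(residues).values()
    )


def entropy_change(early, late, site):
    """Late-minus-early entropy: negative = stabilizing, positive = diversifying."""
    return column_entropy(late, site) - column_entropy(early, site)


# Made-up toy alignments (two residues per sequence), not real data.
early = ["AD", "AG", "AD", "AG"]
late = ["AG", "AG", "TG", "AG"]

for site in (1, 2):
    change = entropy_change(early, late, site)
    trend = "stabilizing" if change < 0 else "more variable" if change > 0 else "unchanged"
    print(f"site {site}: {trend}")
```

In this toy example, site 2 collapses onto a single residue in the later batch, while site 1 picks up a new variant, mirroring the contrast the researchers draw between stabilizing and increasingly variable regions.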
"Considering this virus will be in our midst for some time, we hope the exploration of mutational pathways can anticipate moving targets for speedy therapeutics and vaccine development as we prepare for the next wave," Tomaszewski says. "We, along with thousands of other researchers sequencing, uploading, and curating genome samples through the GISAID Initiative, will continue to keep track of this virus."