Division of Immune Diversity, German Cancer Research Center (DKFZ), Heidelberg, Germany.
Faculty of Biosciences, University of Heidelberg, Heidelberg, Germany.
PLoS One. 2021 Jul 23;16(7):e0255169. doi: 10.1371/journal.pone.0255169. eCollection 2021.
Since the first case of COVID-19 was reported in Wuhan, China, in December 2019, SARS-CoV-2 has spread worldwide and, within a year and a half, has caused 3.56 million deaths. With dramatically increasing infection numbers and the arrival of new variants with increased infectivity, tracking the evolution of the viral genome is crucial for effectively controlling the pandemic and informing vaccine platform development. Our study explores the evolution of SARS-CoV-2 in a representative cohort of whole-genome sequences from the United States, spanning all of 2020 and early 2021. Strikingly, we detected many accumulating Single Nucleotide Variations (SNVs) encoding amino acid changes in the SARS-CoV-2 genome, with a pattern indicative of RNA editing enzymes as major mutators of SARS-CoV-2 genomes. We report three major variants through October of 2020. These revealed 14 key mutations that were found in various combinations among 14 distinct predominant signatures. These signatures likely represent evolutionary lineages of SARS-CoV-2 in the U.S. and reveal clues to its evolution, such as a mutational burst in the summer of 2020 that likely led to a homegrown new variant, and a trend towards higher mutational load among viral isolates, with occasional mutation loss. The last quarter of 2020 revealed a concerning accumulation of mostly novel, low-frequency replacement mutations in the Spike protein, and a hypermutable glutamine residue near the putative furin cleavage site. Finally, data from the end of 2020 and from 2021 revealed the gradual rise to prevalence of known variants of concern, particularly B.1.1.7, that have acquired additional Spike mutations. Overall, our results suggest that predominant viral genomes are dynamically evolving over time, with periods of mutational bursts and unabated mutation accumulation. This high level of existing variation, even at low frequencies and especially in the Spike-encoding region, may become problematic when super-spreader events, akin to serial Founder Events in evolution, drive these rare mutations to prominence.