Gayvert Kaitlyn, McKay Sheldon, Lim Wei Keat, Baum Alina, Kyratsous Christos, Copin Richard, Atwal Gurinder S
Regeneron Pharmaceuticals Inc., Tarrytown, NY, 10591, USA.
Npj Viruses. 2023 Nov 14;1(1):5. doi: 10.1038/s44298-023-00007-z.
Understanding the adaptation of SARS-CoV-2 is critical for the development of effective treatments against this exceptionally successful human pathogen. To predict the emergence of new variants that may escape host immunity or increase virulence, it is important to characterize the biological forces driving its evolution. We conducted a comprehensive population genetic study of over thirteen million SARS-CoV-2 genome sequences, collected over a timeframe of ~3 years, to investigate these forces. Our analysis revealed that during the first year of the pandemic (2020 to 2021), the SARS-CoV-2 genome was subject to strong conservation, with only 3.6% of sites in the receptor binding domain (RBD) of the Spike protein under diversifying pressure. However, we observed a sharp increase in the diversification of the RBD during 2021 (8.1% of sites under diversifying pressure up to 2022), indicating selective pressures that promote the accumulation of mutations. This period coincided with broad viral infection and adoption of vaccination worldwide, and we observed the acquisition of mutations that later defined the Omicron lineages in independent SARS-CoV-2 strains, suggesting that diversifying selection at these sites could have led to their fixation in Omicron lineages by convergent evolution. Since the emergence of Omicron, we observed a further decrease in the conservation of structural genes, including those encoding the M, N, and Spike proteins (13.1% of RBD sites under diversifying pressure up to 2023), and identified new sites that may define future emerging strains. Our results show that ongoing rapid antigenic evolution continues to produce new high-frequency functional variants. Sites under selection are critical for virus fitness, and currently known T cell epitope sequences are highly conserved. Altogether, our study provides a comprehensive dynamic map of sites under selection and conservation across the entirety of the SARS-CoV-2 genome.