NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, Heilongjiang 150028, China.
College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China.
Brief Bioinform. 2021 Mar 22;22(2):1442-1450. doi: 10.1093/bib/bbab042.
Since the first report of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in December 2019, the COVID-19 pandemic has spread rapidly worldwide. Because few viral strains had been sequenced, early studies observed few of the key mutations that matter for the evolutionary trend of the viral genome. Here, we downloaded sequence data for 1809 SARS-CoV-2 strains deposited in GISAID before April 2020 to identify mutations and the functional alterations they cause. In total, we identified 1017 nonsynonymous and 512 synonymous mutations by alignment to the reference genome NC_045512, none of which were observed in the receptor-binding domain (RBD) of the spike protein. On average, each strain could acquire about 1.75 new mutations per month. The mutations observed so far may have little impact on antibodies. Although the genome as a whole is under purifying selection, ORF3a, ORF8 and ORF10 were under positive selection. Only 36 mutations occurred in 1% or more of the strains; these were analyzed further to reveal variants in linkage disequilibrium (LD) and dominant mutations. As a result, we observed five dominant mutations: three nonsynonymous mutations (C28144T, C14408T and A23403G) and two synonymous mutations (T8782C and C3037T). These five mutations occurred in almost all strains in April 2020. In addition, we observed two potentially dominant nonsynonymous mutations, C1059T and G25563T, which occurred in most of the strains in April 2020. Further functional analysis shows that these mutations substantially decrease protein stability, which could lead to a significant reduction in viral virulence. Moreover, the A23403G mutation strengthens the spike-ACE2 interaction and thereby enhances infectivity. Taken together, these results indicate that SARS-CoV-2 is evolving toward enhanced infectivity and reduced virulence.
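As an illustration of the frequency filter and LD analysis described above, the following minimal Python sketch parses mutation codes of the form A23403G, computes the fraction of strains carrying each mutation, keeps those at or above a threshold, and computes a pairwise r² statistic. It is not the authors' pipeline, which the abstract does not describe. The function names, the choice of r² as the LD measure, and the four-strain toy data are assumptions made purely for illustration.

```python
import re
from collections import Counter
from typing import Dict, NamedTuple, Set


class Mutation(NamedTuple):
    ref: str   # reference nucleotide
    pos: int   # 1-based position in the reference genome
    alt: str   # observed nucleotide


_CODE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_mutation(code: str) -> Mutation:
    """Split a code such as 'A23403G' into (ref, position, alt)."""
    m = _CODE.match(code)
    if m is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return Mutation(m.group(1), int(m.group(2)), m.group(3))


def mutation_frequencies(strains: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of strains that carry each mutation."""
    n = len(strains)
    counts = Counter(code for muts in strains.values() for code in muts)
    return {code: c / n for code, c in counts.items()}


def recurrent_mutations(strains: Dict[str, Set[str]],
                        min_fraction: float = 0.01) -> Dict[str, float]:
    """Keep mutations present in at least min_fraction of the strains."""
    freqs = mutation_frequencies(strains)
    return {code: f for code, f in freqs.items() if f >= min_fraction}


def ld_r2(strains: Dict[str, Set[str]], a: str, b: str) -> float:
    """r^2 between two mutations, treating each strain as one haplotype."""
    n = len(strains)
    p_a = sum(a in m for m in strains.values()) / n
    p_b = sum(b in m for m in strains.values()) / n
    p_ab = sum(a in m and b in m for m in strains.values()) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined when either site is monomorphic
    d = p_ab - p_a * p_b
    return d * d / denom


# Hypothetical toy data (NOT from the study): strain id -> mutation codes.
toy_strains = {
    "s1": {"C3037T", "A23403G", "C14408T"},
    "s2": {"C3037T", "A23403G"},
    "s3": {"T8782C", "C28144T"},
    "s4": set(),
}

if __name__ == "__main__":
    print(parse_mutation("A23403G"))
    print(recurrent_mutations(toy_strains, min_fraction=0.3))
    print(ld_r2(toy_strains, "C3037T", "A23403G"))
```

In the toy data, C3037T and A23403G always occur together, so their r² is 1; with real data the per-strain mutation sets would come from aligning each genome to NC_045512.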