Huang Liang, Liu Kui, Ma Ke, Tian Yuan, Qin Yu, Sun Haiyin, Ding Wencheng, Gui Lingli, Wu Peng
Cancer Biology Research Center (Key Laboratory of the Ministry of Education), Tongji Medical College, Tongji Hospital, Huazhong University of Science and Technology, Wuhan, China.
Department of Hematology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
Genes Dis. 2020 Dec;7(4):567-577. doi: 10.1016/j.gendis.2020.05.006. Epub 2020 Jun 2.
As severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continues to spread globally at a worrisome speed, identifying amino acid variations in the virus could help clarify its characteristics. Here, we studied 489 SARS-CoV-2 genomes from 32 countries, obtained from the Nextstrain database, and performed phylogenetic tree analysis by clade, country, and genotype of the surface spike glycoprotein (S protein) at site 614. We found that virus strains from mainland China were mostly distributed in Clade B and the undefined clade in the phylogenetic tree, with very few found in Clade A. In contrast, Clades A2 (one case) and A2a (112 cases) predominantly contained strains from European regions. Moreover, strains in Clades A2 and A2a differed significantly from those of mainland China in the age of the infected population (P = 0.0071, mean age 40.24 to 46.66), although no such difference existed between the US and mainland China. Further analysis demonstrated that the variation of the S protein at site 614 (QHD43416.1: p.614D>G) was characteristic of strains in Clades A2 and A2a. Importantly, this variation was predicted to have neutral or benign effects on the function of the S protein. In addition, global quality estimates and 3D protein structures tended to differ between the two S proteins. In summary, we identified different genomic epidemiology among SARS-CoV-2 strains in different clades, especially an amino acid variation of the S protein at site 614, revealing potential viral genome divergence in SARS-CoV-2 strains.