Department of Biotechnology, Indian Institute of Technology Hyderabad, Kandi, Telangana 502285, India.
Infect Genet Evol. 2021 Sep;93:104973. doi: 10.1016/j.meegid.2021.104973. Epub 2021 Jun 18.
SARS-CoV-2 is currently causing major havoc worldwide through its efficient transmission and propagation. To track the emergence and persistence of mutations during the early stage of the pandemic, a comparative analysis of SARS-CoV-2 whole proteome sequences was performed on 31,389 manually curated whole genome sequences from 84 countries. Among the 7 highly recurring (percentage frequency ≥10%) mutations (Nsp2:T85I, Nsp6:L37F, Nsp12:P323L, Spike:D614G, ORF3a:Q57H, N protein:R203K and N protein:G204R), N protein:R203K and N protein:G204R are co-occurring (dependent) mutations. Nsp12:P323L and Spike:D614G often appear together. The highly recurring Spike:D614G, Nsp12:P323L and Nsp6:L37F mutations, as well as the moderately recurring (percentage frequency ≥1% and <10%) ORF3a:G251V and ORF8:L84S mutations, have led to 4 major clades in addition to a clade that lacks highly recurring mutations. Further, the occurrence of ORF3a:Q57H & Nsp2:T85I, ORF3a:Q57H, and N protein:R203K & G204R along with Nsp12:P323L & Spike:D614G has led to 3 additional sub-clades. Similarly, the occurrence of Nsp6:L37F and ORF3a:G251V together has led to the emergence of a sub-clade. However, ORF8:L84S does not occur along with ORF3a:G251V or Nsp6:L37F. Intriguingly, ORF3a:G251V and ORF8:L84S are found to occur independently of the Nsp12:P323L and Spike:D614G mutations. These clades evolved during the early stage of the pandemic and have disseminated across several countries. Further, Nsp10 is found to be highly resistant to mutations; thus, it can be exploited for drug/vaccine development, and the corresponding gene sequence can be used for diagnosis. In summary, the study reports the diversity of SARS-CoV-2 antigens across the globe during the early stage of the pandemic and facilitates the understanding of viral evolution.
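The recurrence thresholds quoted in the abstract (highly recurring at ≥10%, moderately recurring at ≥1% and <10%) and the idea of co-occurring mutations can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the "low" label for frequencies below 1%, the use of a Jaccard ratio as a co-occurrence measure, and the four toy genomes are all assumptions made here for illustration only.

```python
from collections import Counter


def percentage_frequency(genomes):
    """Map each mutation to the percentage of genomes that carry it."""
    total = len(genomes)
    counts = Counter(m for g in genomes for m in set(g))
    return {m: 100.0 * c / total for m, c in counts.items()}


def recurrence_class(pct):
    """Apply the abstract's thresholds; the 'low' label is an assumption."""
    if pct >= 10:
        return "high"
    if pct >= 1:
        return "moderate"
    return "low"


def co_occurrence(genomes, a, b):
    """Jaccard ratio: genomes carrying both a and b over genomes carrying either.

    This measure is an illustrative choice, not the one used in the study.
    """
    both = sum(1 for g in genomes if a in g and b in g)
    either = sum(1 for g in genomes if a in g or b in g)
    return both / either if either else 0.0


# Hypothetical toy data (NOT from the study): one set of mutations per genome.
toy_genomes = [
    {"Spike:D614G", "Nsp12:P323L"},
    {"Spike:D614G", "Nsp12:P323L", "N:R203K", "N:G204R"},
    {"ORF8:L84S"},
    set(),
]

toy_freq = percentage_frequency(toy_genomes)
toy_classes = {m: recurrence_class(p) for m, p in toy_freq.items()}
toy_pair = co_occurrence(toy_genomes, "Spike:D614G", "Nsp12:P323L")
```

With real data, each genome would first have to be aligned to a reference and its amino-acid substitutions called; that step is outside the scope of this sketch.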