Khetran Saima Rehman, Mustafa Roma
Department of Life Sciences, Sardar Bahadur Khan Women's University, Quetta, Pakistan.
JMIR Bioinform Biotechnol. 2023 Jul 14;4:e43906. doi: 10.2196/43906. eCollection 2023.
COVID-19 and Middle East respiratory syndrome are two pandemic respiratory diseases caused by coronavirus species. The novel disease COVID-19, caused by SARS-CoV-2, was first reported in Wuhan, Hubei Province, China, in December 2019 and became a pandemic within 2 to 3 months, disrupting social and economic life worldwide. Despite the rapid development of vaccines, their distribution has faced obstacles, including a lack of fundamental resources, poor immunization, and manual vaccine replication. Several variants of the original Wuhan strain have emerged in the last 3 years, posing a further challenge for disease control and vaccine development.
The aim of this study was to comprehensively analyze mutations in SARS-CoV-2 variants of concern (VoCs) using a bioinformatics approach, with the goal of identifying novel mutations whose sites could serve as targets for new vaccines.
Reference sequences of the SARS-CoV-2 spike (YP_009724390) and nucleocapsid (YP_009724397) proteins were compared to retrieved sequences of isolates of four VoCs from 14 countries for mutational and evolutionary analyses. Multiple sequence alignment was performed and phylogenetic trees were constructed by the neighbor-joining method with 1000 bootstrap replicates using MEGA (version 6). Mutations in amino acid sequences were analyzed using the MultAlin online tool (version 5.4.1).
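The comparison of each isolate against the reference can be illustrated with a minimal Python sketch. It is not the MultAlin or MEGA procedure used in the study; it only assumes that a reference and an isolate sequence have already been aligned to equal length with `-` as the gap character. The function names and the toy sequences are hypothetical.

```python
def call_mutations(ref_aligned, query_aligned):
    """Compare two aligned protein sequences, numbering by the reference.

    Returns (substitutions, deleted_positions), where substitutions are
    strings such as "D614G" and deleted_positions are reference positions
    that are gapped in the query.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")

    substitutions = []
    deleted_positions = []
    pos = 0  # 1-based position in the ungapped reference
    for ref_aa, query_aa in zip(ref_aligned, query_aligned):
        if ref_aa == "-":
            # Insertion relative to the reference: no reference position.
            continue
        pos += 1
        if query_aa == "-":
            deleted_positions.append(pos)
        elif query_aa != ref_aa:
            # Includes unknown residues such as "X" in the query.
            substitutions.append(f"{ref_aa}{pos}{query_aa}")
    return substitutions, deleted_positions


def group_ranges(positions):
    """Collapse sorted positions into inclusive (start, end) ranges."""
    ranges = []
    for p in positions:
        if ranges and p == ranges[-1][1] + 1:
            ranges[-1][1] = p
        else:
            ranges.append([p, p])
    return [(start, end) for start, end in ranges]


# Toy sequences for illustration only, not real SARS-CoV-2 data.
subs, dels = call_mutations("MKTAYIAK", "MKSA--AK")
print(subs, group_ranges(dels))
```

Numbering by the ungapped reference keeps the labels comparable across isolates, which is what allows a notation such as D614G to refer to the same site in every sequence.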
Among the four VoCs, a total of 143 nonsynonymous mutations and 8 deletions were identified in the spike and nucleocapsid proteins. Multiple sequence alignment and amino acid substitution analysis revealed new mutations, including G72W, M2101I, L139F, a deletion at positions 209-211, G212S, P199L, P67S, I292T, and substitutions with an unknown replacement amino acid, in isolates reported from Egypt (MW533289), the United Kingdom (MT906649), and other regions. The variants B.1.1.7 (Alpha variant) and B.1.617.2 (Delta variant), characterized by higher transmissibility and lethality, harbored the amino acid substitutions D614G, R203K, and G204R at higher prevalence rates in most sequences. Phylogenetic analysis of the novel SARS-CoV-2 variant proteins and some previously reported β-coronavirus proteins indicated that the evolutionary clade was either weakly supported or not supported at all by the β-coronavirus species.
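A prevalence figure of the kind mentioned above can be computed as the fraction of isolates carrying each substitution. The sketch below is an assumption about how such a tally could be made, not the authors' method. The isolate names and their mutation lists are invented for illustration, and the `S:` and `N:` prefixes (spike and nucleocapsid) are a labeling convention assumed here.

```python
from collections import Counter


def substitution_prevalence(calls_by_isolate):
    """Return the fraction of isolates that carry each substitution.

    calls_by_isolate maps an isolate name to its list of substitution
    labels. Each substitution is counted at most once per isolate.
    """
    n_isolates = len(calls_by_isolate)
    if n_isolates == 0:
        return {}
    counts = Counter()
    for substitutions in calls_by_isolate.values():
        counts.update(set(substitutions))
    return {label: count / n_isolates for label, count in counts.items()}


# Hypothetical isolates and calls, not data from the study.
toy_calls = {
    "isolate_A": ["S:D614G", "N:R203K", "N:G204R"],
    "isolate_B": ["S:D614G"],
    "isolate_C": ["S:D614G", "N:R203K", "N:G204R"],
    "isolate_D": ["S:D614G", "N:P199L"],
}

prevalence = substitution_prevalence(toy_calls)
for label in sorted(prevalence, key=prevalence.get, reverse=True):
    print(f"{label}: {prevalence[label]:.2f}")
```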
This study could contribute to a better understanding of the basic nature of SARS-CoV-2 and its four major variants. The numerous novel mutations detected could also improve understanding of the VoCs and help identify mutations suitable as vaccine targets. Moreover, these data offer evidence for new types of mutations in VoCs, which may provide insight into the epidemiology of SARS-CoV-2.