Department of Computer Science and Information Technology, Institute of Technical Education and Research, Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India.
Department of Electronics and Communication Engineering, Jaypee Institute of Information Technology, Noida, Uttar Pradesh, India.
Infect Genet Evol. 2021 Aug;92:104823. doi: 10.1016/j.meegid.2021.104823. Epub 2021 Apr 2.
The rapid spread of SARS-CoV-2, driven by its high transmission rate, has caused a global pandemic. To contain the virus, researchers are working around the clock on a solution in the form of a vaccine. The pandemic has severely affected economies and healthcare systems worldwide, so an efficient vaccine design is urgently needed. Moreover, a generalised vaccine for a heterogeneous human population should take into account virus genomes from different countries. In this work, we therefore performed a genome-wide analysis of 10,664 SARS-CoV-2 genomes from 73 countries to identify potential conserved regions for the development of a peptide-based synthetic vaccine, i.e. epitopes with high immunogenic and antigenic scores. The 10,664 genomes are aligned using the multiple sequence alignment tool Clustal Omega. Entropy is then computed for each genomic coordinate of the aligned genomes, and the entropy values are used to find the conserved regions. These regions are refined using two criteria: their length must be at least 60 nt, and their corresponding protein sequences must contain no stop codons. Nucleotide BLAST is then used to verify the specificity of the conserved regions. As a result, we obtained 17 conserved regions that belong to NSP3, NSP4, NSP6, NSP8, RdRp, Helicase, endoRNAse, 2'-O-RMT, Spike glycoprotein, ORF3a protein, Membrane glycoprotein and Nucleocapsid protein. Finally, these conserved regions are used to identify T-cell and B-cell epitopes together with their immunogenic and antigenic scores. Based on these scores, the most immunogenic and antigenic epitopes are selected for each of the 17 conserved regions. In total, we obtained 30 MHC-I and 24 MHC-II restricted T-cell epitopes, with 14 and 13 unique HLA alleles respectively, and 21 B-cell epitopes for the 17 conserved regions.
Moreover, to validate the relevance of these epitopes, the binding conformations of the MHC-I and MHC-II restricted T-cell epitopes are shown with respect to their HLA alleles. The physico-chemical properties of the epitopes are also reported, along with Ramachandran plots, Z-scores and population coverage. Overall, the analysis shows that the identified epitopes can be considered potential candidates for vaccine design.
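The entropy-based search for conserved regions described in the abstract can be illustrated with a small sketch. This is not the authors' code: the function names, the entropy threshold (`max_entropy`), the treatment of gap characters as ordinary symbols, and the `frame` parameter are all assumptions made for illustration. The abstract states only that per-coordinate entropy is computed on the alignment, that regions must be at least 60 nt long, and that their protein sequences must contain no stop codons.

```python
import math
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_regions(aligned, min_len=60, max_entropy=0.0):
    """Find runs of consecutive low-entropy columns.

    `aligned` is a list of equal-length aligned sequences.
    Returns half-open (start, end) column intervals whose length is at
    least `min_len`. The threshold `max_entropy` is an assumption; the
    abstract does not state the cut-off that was used.
    """
    n_cols = len(aligned[0])
    entropies = [column_entropy([seq[i] for seq in aligned]) for i in range(n_cols)]
    regions, start = [], None
    # A sentinel with infinite entropy closes a run that reaches the last column.
    for i, h in enumerate(entropies + [float("inf")]):
        if h <= max_entropy:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i))
            start = None
    return regions


def has_stop_codon(nt, frame=0):
    """True if any complete codon of `nt`, read from offset `frame`, is a stop codon.

    The reading frame would normally come from the genome annotation;
    here it is a hypothetical parameter.
    """
    codons = (nt[i:i + 3] for i in range(frame, len(nt) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)


# Toy alignment: the first nine columns are identical, the last one varies.
toy = ["ATGGCCAAAT", "ATGGCCAAAC", "ATGGCCAAAG"]
toy_regions = conserved_regions(toy, min_len=6)
kept = [(s, e) for s, e in toy_regions if not has_stop_codon(toy[0][s:e])]
print(kept)
```

On real data, `min_len=60` would match the length criterion in the abstract, and the aligned sequences would come from the Clustal Omega output rather than a hand-written list.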