Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Milano, Italia.
PLoS One. 2024 Sep 19;19(9):e0307873. doi: 10.1371/journal.pone.0307873. eCollection 2024.
Epitopes are specific structures in antigens that are recognized by the immune system. They are widely used in immunology-related applications such as vaccine development, drug design, and the diagnosis, treatment, and prevention of disease. SARS-CoV-2 has been the main focus of the viral and genomic surveillance community over the last four years. Its ability to mutate and acquire new characteristics as it reorganizes into new variants has been analyzed from many perspectives. The importance of understanding how epitopes are affected by mutations that accumulate at the protein level should not be underestimated.
Focusing on Omicron-named SARS-CoV-2 lineages, including the latest WHO-designated Variants of Interest, we propose a data retrieval, integration, and analysis workflow for a database-wide study of the impact of lineage-characterizing mutations on all T cell and B cell linear epitopes collected in the Immune Epitope Database (IEDB) for SARS-CoV-2.
Our workflow yields novel qualitative and quantitative results on 1) the coverage of viral proteins by deposited epitopes; 2) the distribution of epitopes that are mutated across Omicron variants; and 3) the distribution of Omicron-characterizing mutations across epitopes. Results are discussed by epitope type, assay response frequency, and sample size. The workflow can be rerun at any point in time, given updated variant characterizations and epitopes from IEDB, thereby providing an on-demand quantitative picture of the impact of mutations.
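As a rough illustration of the first two quantities listed above, the following minimal Python sketch computes protein coverage by epitopes and identifies epitopes whose span contains a characterizing mutation. It is not the authors' pipeline: the `Epitope` and `Mutation` records, the function names, and the toy entries are hypothetical and chosen only for demonstration. Real input would come from an IEDB export and a lineage characterization table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epitope:
    """Hypothetical record for a linear epitope (1-based, inclusive span)."""
    epitope_id: str
    protein: str
    start: int
    end: int


@dataclass(frozen=True)
class Mutation:
    """Hypothetical record for an amino acid change at a protein position."""
    protein: str
    position: int
    label: str


def mutated_epitopes(epitopes, mutations):
    """Map each epitope id to the mutation labels falling inside its span."""
    hits = {}
    for ep in epitopes:
        labels = [
            m.label
            for m in mutations
            if m.protein == ep.protein and ep.start <= m.position <= ep.end
        ]
        if labels:
            hits[ep.epitope_id] = labels
    return hits


def protein_coverage(epitopes, protein, length):
    """Fraction of residues of `protein` covered by at least one epitope."""
    covered = set()
    for ep in epitopes:
        if ep.protein == protein:
            covered.update(range(max(ep.start, 1), min(ep.end, length) + 1))
    return len(covered) / length


# Toy data for illustration only; not taken from IEDB or from the study.
toy_epitopes = [
    Epitope("E1", "S", 495, 505),
    Epitope("E2", "S", 10, 20),
    Epitope("E3", "N", 1, 10),
]
toy_mutations = [
    Mutation("S", 501, "N501Y"),
    Mutation("S", 498, "Q498R"),
    Mutation("N", 13, "P13L"),
]

hits = mutated_epitopes(toy_epitopes, toy_mutations)
spike_coverage = protein_coverage(toy_epitopes, "S", 1273)
print(hits)
print(round(spike_coverage, 4))
```

Counting, for each mutation, the epitopes it falls in (the third quantity) would follow by inverting the `hits` mapping.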
A large-scale, data-driven analysis such as the one presented here can inform future genomic surveillance policies for combating SARS-CoV-2 and future epidemic viruses.