Ensemble Machine Learning Model to Predict SARS-CoV-2 T-Cell Epitopes as Potential Vaccine Targets.

Authors

Bukhari Syed Nisar Hussain, Jain Amit, Haq Ehtishamul, Mehbodniya Abolfazl, Webber Julian

Affiliations

University Institute of Computing, Chandigarh University, NH-95, Chandigarh-Ludhiana Highway, Mohali 140413, India.

Department of Biotechnology, University of Kashmir, Srinagar 190006, India.

Publication

Diagnostics (Basel). 2021 Oct 26;11(11):1990. doi: 10.3390/diagnostics11111990.

Abstract

The ongoing outbreak of coronavirus disease 2019 (COVID-19), caused by a single-stranded RNA virus called severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a worldwide pandemic that continues to date. Vaccination has proven to be by far the most effective measure against COVID-19 and for combating the outbreak. Among vaccine types, epitope-based peptide vaccines have received comparatively little attention, yet they hold large untapped potential for improving vaccine safety and immunogenicity. The peptides used in such vaccines are chemically synthesized based on the amino acid sequences of antigenic proteins (T-cell epitopes) of the target pathogen. Identifying antigenic proteins through wet-lab experiments is very difficult, expensive, and time-consuming. We propose an ensemble machine learning (ML) model for predicting T-cell epitopes (also known as immune-relevant determinants or antigenic determinants) of SARS-CoV-2 from the physicochemical properties of amino acids. To train the model, we retrieved experimentally determined SARS-CoV-2 T-cell epitopes from the Immune Epitope Database and Analysis Resource (IEDB). On a test set of SARS-CoV-2 peptides (T-cell epitopes and non-epitopes) obtained from IEDB, the model achieved an accuracy of 98.20%, an AUC (area under the ROC curve) of 0.991, a Gini coefficient of 0.994, a specificity of 0.971, a sensitivity of 0.982, an F-score of 0.990, and a precision of 0.981. Repeated 5-fold cross-validation gave an average accuracy of 97.98%. Comparison with five robust machine learning classifiers and with existing T-cell epitope prediction tools, such as NetMHC and CTLpred, suggests that the proposed model performs better. Epitopes predicted by the model have a high probability of serving as peptide vaccine candidates, subject to in vitro and in vivo assessment. The model should help researchers in vaccine development save time when screening active SARS-CoV-2 T-cell epitope candidates against inactive ones.
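For readers unfamiliar with the evaluation measures listed in the abstract, the following is a minimal sketch of how such metrics are commonly computed for a binary epitope / non-epitope classifier. It is not the authors' code. The function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical, and the Gini coefficient is taken here as 2 × AUC − 1, a common convention; the abstract does not state which definition the paper uses.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    precision: float
    f_score: float


def _ratio(num, den):
    # Return 0.0 when the denominator is empty (a convention chosen for this sketch).
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for labels in {0, 1}, where 1 = epitope."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = _ratio(tp, tp + fn)  # true positive rate (recall)
    precision = _ratio(tp, tp + fp)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        sensitivity=sensitivity,
        specificity=_ratio(tn, tn + fp),  # true negative rate
        precision=precision,
        f_score=_ratio(2 * precision * sensitivity, precision + sensitivity),
    )


def auc_score(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini_from_auc(auc):
    # Assumed convention: Gini = 2 * AUC - 1.
    return 2.0 * auc - 1.0


# Toy data, invented for illustration only (not taken from IEDB or the paper).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 threshold

print(binary_metrics(labels, predicted))
print("AUC:", auc_score(labels, scores))
print("Gini:", gini_from_auc(auc_score(labels, scores)))
```

On this toy data, the threshold-based metrics all come out to 0.75, the AUC is 0.9375, and the corresponding Gini value is 0.875. Note that accuracy, sensitivity, specificity, precision, and F-score depend on the chosen threshold, whereas AUC and Gini are computed from the raw scores.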

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f331/8617960/b70b21db4a2d/diagnostics-11-01990-g001.jpg