Bukhari Syed Nisar Hussain, Elshiekh E, Abbas Mohamed
National Institute of Electronics and Information Technology (NIELIT), Srinagar, Jammu and Kashmir, India.
Department of Radiological Sciences, College of Applied Medical Sciences, King Khalid University, Abha, Saudi Arabia.
PeerJ Comput Sci. 2024 Apr 25;10:e1980. doi: 10.7717/peerj-cs.1980. eCollection 2024.
Most existing SARS-CoV-2 vaccines work by presenting the whole pathogen, in attenuated form, to the immune system to invoke an immune response. By contrast, a peptide-based vaccine (PBV) relies on identifying and chemically synthesizing only the immunodominant peptides, known as T-cell epitopes (TCEs), to induce a specific immune response against a particular pathogen. However, PBVs have received less attention despite holding large untapped potential for improving vaccine safety and immunogenicity. Identifying these TCEs for PBV design through wet-lab experiments is difficult, expensive, and time-consuming. Machine learning (ML) techniques can accurately predict TCEs, saving time and cost and speeding up vaccine development. This work proposes novel hybrid ML techniques based on the physicochemical properties of peptides to predict SARS-CoV-2 TCEs. The proposed hybrid ML techniques were evaluated using various ML model evaluation metrics and demonstrated promising results. The hybrid of a decision tree classifier with a chi-squared feature weighting technique and a forward search optimal feature searching algorithm was identified as the best model, with an accuracy of 98.19%. Furthermore, K-fold cross-validation (KFCV) was performed to check that the model is reliable; the results indicate that the hybrid random forest model performs consistently well in terms of accuracy relative to the other hybrid approaches. The predicted TCEs are highly likely to serve as promising vaccine targets, subject to evaluations both in vivo and in vitro. This development could potentially save countless lives globally, prevent future epidemic-scale outbreaks, and reduce the risk of mutation escape.
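The hybrid pipeline described in the abstract (chi-squared feature weighting, forward feature search, then K-fold cross-validation of a classifier) can be illustrated with a minimal standard-library sketch. Everything here is an assumption made for illustration and not the authors' implementation: the data are synthetic rather than peptide physicochemical features, features are binarized at their median before the chi-squared test, the forward search is a single greedy pass in chi-squared rank order, and a nearest-centroid classifier stands in for the decision tree. The function names (`chi2_score`, `forward_search`, `kfold_accuracy`, `make_data`) are hypothetical.

```python
import random
from statistics import median

def make_data(n=200, n_features=6, seed=0):
    """Synthetic stand-in data: features 0 and 1 carry signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[1] += 1.5 * label
        X.append(row)
        y.append(label)
    return X, y

def chi2_score(column, labels):
    """Chi-squared statistic of the 2x2 table (feature > median) vs. binary label."""
    cut = median(column)
    table = [[0, 0], [0, 0]]
    for value, label in zip(column, labels):
        table[int(value > cut)][label] += 1
    n = len(labels)
    score = 0.0
    for i in range(2):
        for j in range(2):
            expected = sum(table[i]) * (table[0][j] + table[1][j]) / n
            if expected > 0:
                score += (table[i][j] - expected) ** 2 / expected
    return score

def fit_centroids(X, y, feats):
    """Per-class mean of the selected features (stand-in for a decision tree)."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    return centroids

def predict(centroids, x, feats):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(label):
        return sum((x[f] - c) ** 2 for f, c in zip(feats, centroids[label]))
    return min(centroids, key=dist)

def kfold_accuracy(X, y, feats, k=5):
    """Pooled accuracy over k folds; fold i holds out samples i, i+k, i+2k, ..."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        centroids = fit_centroids(X_train, y_train, feats)
        correct += sum(predict(centroids, X[i], feats) == y[i] for i in test_idx)
    return correct / n

def forward_search(X, y, ranked, k=5):
    """Greedy forward pass: keep a feature only if it raises K-fold accuracy."""
    selected, best = [], 0.0
    for f in ranked:
        acc = kfold_accuracy(X, y, selected + [f], k)
        if acc > best:
            selected, best = selected + [f], acc
    return selected, best

X, y = make_data()
scores = [chi2_score([row[f] for row in X], y) for f in range(len(X[0]))]
ranked = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
selected, accuracy = forward_search(X, y, ranked)
print("chi-squared ranking:", ranked)
print("selected features:", selected, "K-fold accuracy:", round(accuracy, 3))
```

In this sketch the chi-squared scores only set the order in which candidate features are tried, while the cross-validated accuracy decides whether each one is kept. The accuracy is estimated on the same folds that guide the search, so a real study would hold out a separate test set for the final figure.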