National Institute of Electronics and Information Technology (NIELIT), Ministry of Electronics and Information Technology (MeitY), Govt. of India, Srinagar, J&K, 191132, India.
Department of Electronics and Communication Engineering, Kuwait College of Science and Technology (KCST), Doha Area, Kuwait.
Sci Rep. 2022 May 12;12(1):7810. doi: 10.1038/s41598-022-11731-6.
Zika fever is an infectious disease caused by the Zika virus (ZIKV). The disease is claiming millions of lives worldwide, primarily in developing countries. In addition to vector control strategies, the most effective way to prevent the spread of ZIKV infection is vaccination. There is currently no clinically approved vaccine to combat ZIKV infection and curb its pandemic spread. An epitope-based peptide vaccine (EBPV) is seen as a powerful alternative to conventional vaccines because of its low production cost and short production time. Nonetheless, EBPVs have received less attention, even though they have significant untapped potential for enhancing vaccine safety, immunogenicity, and cross-reactivity. This vaccine technology is based on selected antigenic peptides of the target pathogen, called T-cell epitopes (TCEs), which are synthesized chemically from their amino acid sequences. Identifying TCEs with wet-lab experimental approaches is challenging, expensive, and time-consuming. Therefore, in this study we present a computational model for predicting ZIKV TCEs. The proposed model is an ensemble of decision trees that uses the physicochemical properties of amino acids. Such a model could save a large amount of time and effort and so speed up vaccine development. The peptide sequence dataset for model training was retrieved from the Virus Pathogen Database and Analysis Resource (ViPR). The dataset consists of experimentally verified TCEs and non-TCEs. The model demonstrated promising results when evaluated on the test dataset: accuracy, AUC, sensitivity, specificity, Gini, and Matthews correlation coefficient (MCC) were 0.9789, 0.984, 0.981, 0.987, 0.974, and 0.948, respectively. The consistency and reliability of the model were assessed with five-fold cross-validation, which gave a mean accuracy of 0.97864. Finally, the model was compared with standard machine learning (ML) algorithms and outperformed all of them. The proposed model will aid in predicting novel and immunodominant TCEs of ZIKV. The predicted TCEs may serve as prospective vaccine targets, subject to in vivo and in vitro assessment, which could help save lives worldwide, prevent future epidemic-scale outbreaks, and lower the possibility of mutation escape.
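As an illustration of how the threshold-based metrics named in the abstract (accuracy, sensitivity, specificity, and MCC) relate to a binary TCE / non-TCE confusion matrix, the following is a minimal sketch using the standard textbook definitions. It is not the authors' evaluation code: the function names and the toy labels are hypothetical, and AUC and Gini are left out because the abstract does not state how they were computed.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = TCE, 0 = non-TCE)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, specificity and MCC as a dict."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Toy labels, invented for illustration only (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_metrics = evaluate(toy_true, toy_pred)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the TCE and non-TCE classes are imbalanced.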