

Ensemble Technique for Prediction of T-cell Mycobacterium tuberculosis Epitopes.

Author information

Computer Science and Engineering Department, Thapar Institute of Engineering & Technology, Patiala, Punjab, 147004, India.

Publication information

Interdiscip Sci. 2019 Dec;11(4):611-627. doi: 10.1007/s12539-018-0309-0. Epub 2018 Nov 7.

Abstract

Developing an effective machine-learning model for T-cell Mycobacterium tuberculosis (M. tuberculosis) epitopes saves biologists time and effort in identifying epitopes in a targeted antigen. The existing NetMHC 2.2, NetMHC 2.3, NetMHC 3.0 and NetMHC 4.0 servers estimate the binding capacity of a peptide, but predicting whether a given peptide is an M. tuberculosis epitope or a non-epitope remains a challenge for them. One server, CTLpred, works in this category but is limited to peptides of length 9 (9-mers). This work therefore proposes a direct method for predicting whether a peptide is an M. tuberculosis epitope or a non-epitope, which also overcomes the limitations of the above servers. The proposed method works with variable-length epitopes, including those longer than 9-mers. Identifying T-cell or B-cell epitopes in the targeted antigen is the main goal in designing epitope-based vaccines, immunodiagnostic tests and antibody production. It is therefore important to introduce a reliable system that may help in the diagnosis of M. tuberculosis. In the present study, computational intelligence methods are used to classify T-cell M. tuberculosis epitopes. The caret feature selection approach is used to find the set of relevant features. The ensemble model is designed by combining three models and is used to predict M. tuberculosis epitopes of variable length (7-40-mers). The proposed ensemble model achieves 82.0% accuracy, 0.89 specificity and 0.77 sensitivity, and repeated k-fold cross-validation gives an average accuracy of 80.61%. The proposed ensemble model has been validated and compared with the NetMHC 2.3 and NetMHC 4.0 servers and the CTLpred T-cell prediction server.
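To make the reported quantities concrete, the following is a minimal Python sketch of how three classifiers' epitope/non-epitope predictions could be combined and scored. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the three models are combined, so simple majority voting is assumed here, and the labels, the function names `majority_vote` and `binary_metrics`, and the three prediction lists are hypothetical toy values rather than data from the study.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model label lists by taking the most common label per sample.

    Assumption: majority voting stands in for the (unspecified) combination
    rule of the ensemble described in the abstract.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true epitopes that were predicted as epitopes.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of true non-epitopes predicted as non-epitopes.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy data: 1 = epitope, 0 = non-epitope.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [1, 1, 0, 1, 0, 0, 1, 0]
model_b = [1, 0, 0, 1, 0, 1, 0, 0]
model_c = [1, 1, 1, 0, 0, 0, 0, 1]

ensemble = majority_vote(model_a, model_b, model_c)
metrics = binary_metrics(y_true, ensemble)
print(ensemble)
print(metrics)
```

With an odd number of voters and two classes there are no ties, which is one reason three-model ensembles are convenient for binary epitope/non-epitope decisions.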

