iAtbP-Hyb-EnC: Prediction of antitubercular peptides via heterogeneous feature representation and genetic algorithm based ensemble learning model.

Affiliations

Department of Computer Science, Abdul Wali Khan University, Mardan, KP, 23200, Pakistan.

Department of Information Technology, The University of Haripur, KP, Pakistan.

Publication information

Comput Biol Med. 2021 Oct;137:104778. doi: 10.1016/j.compbiomed.2021.104778. Epub 2021 Aug 25.

Abstract

Tuberculosis (TB) is a worldwide illness caused by the bacterium Mycobacterium tuberculosis. Owing to the high prevalence of multidrug-resistant tuberculosis, numerous traditional strategies for developing novel alternative therapies have been proposed, but their effectiveness and reliability are not always consistent. Peptide-based therapy has recently been regarded as a preferable alternative because of its excellent selectivity in targeting specific cells without affecting normal cells. However, owing to the rapid growth in the number of peptide samples, predicting TB accurately has become a challenging task. To identify antitubercular peptides effectively, an intelligent and reliable prediction model is indispensable. In this study, an ensemble learning approach was used to improve prediction results by compensating for the shortcomings of individual classification algorithms. Initially, three distinct representation approaches were used to formulate the training samples: k-space amino acid composition, composite physicochemical properties, and one-hot encoding. The feature vectors produced by these extraction methods were then combined to generate a heterogeneous vector. Finally, using the individual and heterogeneous vectors, five classification models of differing nature were used to evaluate prediction rates. In addition, a genetic algorithm-based ensemble model was used to improve the proposed model's prediction and training capabilities. On the training and independent datasets, the proposed ensemble model achieved accuracies of 94.47% and 92.68%, respectively. The proposed "iAtbP-Hyb-EnC" model outperformed existing predictors, reporting a training accuracy about 10% higher. The "iAtbP-Hyb-EnC" model is suggested to be a reliable tool for scientists and might play a valuable role in academic research and drug discovery. The source code and all datasets are publicly available at https://github.com/Farman335/iAtbP-Hyb-EnC.
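As a rough illustration of how a heterogeneous feature vector of the kind described in the abstract might be assembled, the sketch below concatenates a k-spaced residue-pair composition with a flattened one-hot encoding. It is not the authors' implementation. The function names, the gap values `k_values=(0, 1)`, the padding length `max_len=10`, and the example peptide are all hypothetical choices made for illustration. The composite physicochemical descriptor is omitted because the abstract does not specify it.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def k_spaced_pair_composition(seq, k):
    """Relative frequency of each residue pair separated by k other residues."""
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(seq) - k - 1):
        pair = seq[i] + seq[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]


def one_hot(seq, max_len):
    """Flattened one-hot matrix, truncated or zero-padded to max_len residues."""
    vec = []
    for pos in range(max_len):
        row = [0.0] * len(AMINO_ACIDS)
        if pos < len(seq):
            idx = AMINO_ACIDS.find(seq[pos])
            if idx >= 0:  # unknown residues stay as an all-zero row
                row[idx] = 1.0
        vec.extend(row)
    return vec


def heterogeneous_vector(seq, k_values=(0, 1), max_len=10):
    """Concatenate the individual descriptors into one feature vector."""
    features = []
    for k in k_values:  # assumed gap sizes, not taken from the paper
        features.extend(k_spaced_pair_composition(seq, k))
    features.extend(one_hot(seq, max_len))
    return features


# Hypothetical four-residue peptide, used only to show the output shape.
vector = heterogeneous_vector("ACDA")
print(len(vector))  # 2 * 400 pair frequencies + 10 * 20 one-hot entries
```

Under these assumptions, every peptide maps to a vector of the same length whatever its own length, which is what lets a single set of classifiers be trained on the individual descriptors and on their concatenation.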
