Khabaz Hosein, Rahimi-Nasrabadi Mehdi, Keihan Amir Homayoun
Molecular Biology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran.
Faculty of Pharmacy, Baqiyatallah University of Medical Sciences, Tehran, Iran.
Front Mol Biosci. 2023 Sep 18;10:1238509. doi: 10.3389/fmolb.2023.1238509. eCollection 2023.
[species] is a dangerous pathogen that causes a wide range of infections. Antimicrobial peptides (AMPs) have emerged as a promising basis for developing antibiotic agents against multidrug-resistant bacteria such as [species]. Yet most studies that develop classification tools for antimicrobial peptide activity do not focus on any specific species, which limits their applicability. Here, using an up-to-date dataset, we developed a hierarchical machine learning model for classifying peptides with antimicrobial activity against [species]. The first-level model classifies peptides into AMPs and non-AMPs. The second-level model classifies AMPs into those active against [species] and those not active against it. Results from both classifiers demonstrate the effectiveness of the hierarchical approach. A comprehensive set of physicochemical and linguistic-based features was evaluated; after the feature selection steps, only a subset of the physicochemical properties was retained. The final model achieved an F1-score of 0.80, a recall of 0.86, a balanced accuracy of 0.80, and a specificity of 0.73 on the test set. Susceptibility to a given AMP varies widely among target species. Therefore, it cannot be assumed that AMP candidates proposed by AMP/non-AMP classifiers will show suitable activity against a specific species. We addressed this issue by creating a hierarchical machine learning model that can be used in practical applications to extract potential antimicrobial peptides against [species] from peptide libraries.
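The two-level decision flow and the four reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `hierarchical_predict` and `binary_metrics`, the output labels, and the two toy rules in the usage example are all hypothetical placeholders with no biological validity. The metric definitions are the standard ones, under which balanced accuracy is the mean of recall and specificity; this agrees with the reported values, since (0.86 + 0.73) / 2 = 0.795, or about 0.80.

```python
from typing import Callable, Dict, Sequence


def hierarchical_predict(
    peptide: str,
    is_amp: Callable[[str], bool],
    is_active_against_target: Callable[[str], bool],
) -> str:
    """Apply the level-1 classifier, then the level-2 classifier only to predicted AMPs."""
    if not is_amp(peptide):
        return "non-AMP"
    if is_active_against_target(peptide):
        return "AMP, active against target"
    return "AMP, not active against target"


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute recall, specificity, balanced accuracy, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    recall = _ratio(tp, tp + fn)          # true positive rate
    specificity = _ratio(tn, tn + fp)     # true negative rate
    precision = _ratio(tp, tp + fp)
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)  # harmonic mean of precision and recall
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": (recall + specificity) / 2,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical stand-ins for trained models; arbitrary rules for illustration only.
    toy_level1 = lambda seq: sum(seq.count(aa) for aa in "KR") >= 2
    toy_level2 = lambda seq: "W" in seq

    for seq in ["KKWLR", "KRAAA", "GGGGG"]:
        print(seq, "->", hierarchical_predict(seq, toy_level1, toy_level2))

    print(binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

In a real pipeline the two callables would wrap trained classifiers operating on the selected physicochemical features. The cascade means the second-level model only ever sees peptides that the first level has accepted as AMPs.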