AMPFinder: A computational model to identify antimicrobial peptides and their functions based on sequence-derived information.
Affiliations
The Affiliated Changzhou No 2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China; School of Computer Science and Artificial Intelligence Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China.
School of Computer Science and Artificial Intelligence Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China.
Publication information
Anal Biochem. 2023 Jul 15;673:115196. doi: 10.1016/j.ab.2023.115196. Epub 2023 May 24.
Antimicrobial peptides (AMPs), also called host defense peptides, occur in all classes of life, generally consist of 5-100 amino acids, and can kill mycobacteria, enveloped viruses, bacteria, fungi, cancerous cells, and other targets. Because AMPs do not readily induce drug resistance, they are promising agents for developing novel therapies. It is therefore urgent to identify AMPs and predict their functions in a high-throughput way. In this paper, we propose AMPFinder, a cascaded computational model that identifies AMPs and their functional types based on sequence-derived information and life language embeddings. Compared with other state-of-the-art methods, AMPFinder achieves higher performance on both AMP identification and AMP function prediction. On an independent test dataset, AMPFinder improves F1-score by 1.45%-6.13%, MCC by 2.92%-12.86%, AUC by 5.13%-8.56%, and AP by 9.20%-21.07%. In 10-fold cross-validation on a public dataset, AMPFinder also achieves a lower bias of R, an improvement of 18.82%-19.46%. These comparisons show that AMPFinder can accurately identify AMPs and their functional types. The datasets, source code, and a user-friendly application are available at https://github.com/abcair/AMPFinder.
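For readers less familiar with two of the metrics reported above, the following is a minimal sketch of how F1-score and the Matthews correlation coefficient (MCC) are conventionally computed for a binary AMP / non-AMP classifier. It uses the standard textbook definitions; the function names and the toy labels are illustrative assumptions and are not taken from the AMPFinder source code or its results.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = AMP, 0 = non-AMP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy labels, made up purely for illustration (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("F1 :", f1_score(tp, fp, fn))
print("MCC:", mcc(tp, tn, fp, fn))
```

MCC uses all four cells of the confusion matrix, so it is often preferred over F1 when AMP and non-AMP classes are imbalanced; that may be one reason both metrics are reported.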