Luo Jiawei, Zhao Kejuan, Chen Junjie, Yang Caihua, Qu Fuchuan, Liu Yumeng, Jin Xiaopeng, Yan Ke, Zhang Yang, Liu Bin
School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China.
School of Science, Harbin Institute of Technology, Shenzhen 518055, China.
Genomics Proteomics Bioinformatics. 2025 Jan 15;22(6). doi: 10.1093/gpbjnl/qzae084.
Functional peptides are short amino acid fragments with a wide range of beneficial functions in living organisms. Most previous studies have focused on mono-functional peptides, but an increasing number of multi-functional peptides have been discovered. Despite enormous experimental efforts to assay multi-functional peptides, only a small portion of the millions of known peptides has been explored. Effective and accurate techniques for identifying multi-functional peptides can facilitate their discovery and mechanistic understanding. In this study, we present iMFP-LG, a method for multi-functional peptide identification based on protein language models (pLMs) and graph attention networks (GATs). Our comparative analyses demonstrate that iMFP-LG outperforms state-of-the-art methods in identifying both multi-functional bioactive peptides and multi-functional therapeutic peptides. The interpretability of iMFP-LG is illustrated by visualizing attention patterns in the pLMs and GATs. Given this strong performance, we employed iMFP-LG to screen millions of known peptides in the UniRef90 database for novel peptides with both anti-microbial and anti-cancer functions. Eight candidate peptides were identified, one of which was validated to possess both anti-bacterial and anti-cancer properties through molecular structure alignment and biological experiments. We anticipate that iMFP-LG can assist in the discovery of multi-functional peptides and contribute to the advancement of peptide drug design.