School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China.
Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Haidian District, Beijing, 100081, China.
BMC Biol. 2023 Oct 31;21(1):238. doi: 10.1186/s12915-023-01740-w.
Therapeutic peptides play an essential role in human physiology, treatment paradigms, and biopharmacy. Several computational methods have been developed to identify the functions of therapeutic peptides, framed as either binary or multi-label classification. However, these methods do not explicitly exploit the relationships among different functions, which limits further improvement in prediction performance. Moreover, as peptide detection technology develops, more peptide functions will be discovered. It is therefore necessary to explore computational methods for detecting therapeutic peptide functions with limited labeled data.
In this study, we propose TPpred-LE, a novel method based on the Transformer framework for predicting the multiple functions of therapeutic peptides. It explicitly extracts function correlation information through label embedding and exploits function specificity through function-specific classifiers. We also incorporate a multi-label classifier retraining approach (MCRT) into TPpred-LE to detect new therapeutic functions with limited labeled data. Experimental results show that TPpred-LE outperforms other state-of-the-art methods and that TPpred-LE with MCRT is robust when labeled data are limited.
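The combination of label embeddings and function-specific classifiers can be illustrated with a toy sketch. This is not the TPpred-LE implementation, whose details the abstract does not give. It assumes one plausible arrangement: each function has an embedding vector that attends over per-residue features to form a function-specific summary, and each function has its own logistic classifier. All names (`label_attention`, `predict_functions`), dimensions, and the random values standing in for learned parameters are hypothetical. The sketch also omits any interaction among the label embeddings themselves, which is where function correlation would presumably be modeled.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def label_attention(residue_feats, label_emb):
    """Pool residue features into one vector, weighted by each
    residue's similarity to the given label (function) embedding."""
    weights = softmax([dot(label_emb, r) for r in residue_feats])
    dim = len(label_emb)
    return [sum(w * r[d] for w, r in zip(weights, residue_feats))
            for d in range(dim)]


def predict_functions(residue_feats, label_embs, heads):
    """Return one independent probability per function (multi-label).
    Each function uses its own embedding and its own classifier head."""
    probs = []
    for label_emb, (w, b) in zip(label_embs, heads):
        pooled = label_attention(residue_feats, label_emb)
        logit = dot(w, pooled) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


# Toy setup (assumption): random numbers stand in for encoder outputs
# and learned parameters.
rng = random.Random(0)
DIM, N_FUNCTIONS, PEPTIDE_LEN = 8, 3, 12
residue_feats = [[rng.gauss(0, 1) for _ in range(DIM)]
                 for _ in range(PEPTIDE_LEN)]
label_embs = [[rng.gauss(0, 1) for _ in range(DIM)]
              for _ in range(N_FUNCTIONS)]
heads = [([rng.gauss(0, 1) for _ in range(DIM)], 0.0)
         for _ in range(N_FUNCTIONS)]

probs = predict_functions(residue_feats, label_embs, heads)
print(probs)  # one probability per function
```

Because each function has its own head in this sketch, supporting a newly discovered function would only require adding one embedding and one head. That is one way a retraining strategy for limited labeled data could be organized, though the abstract does not specify how MCRT works.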
In summary, TPpred-LE is a function-specific classifier for accurate therapeutic peptide function prediction, and its results demonstrate the importance of relationship information among functions for this task. MCRT is a simple but effective strategy for detecting functions with limited labeled data.