Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, 2001 Longxiang Road, Shenzhen 518172, China.
School of Science and Engineering, The Chinese University of Hong Kong, 2001 Longxiang Road, Shenzhen 518172, China.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae460.
Cancer is a severe illness that significantly threatens human life and health, and anticancer peptides (ACPs) represent a promising therapeutic strategy for combating it. In silico methods enable rapid and accurate identification of ACPs without requiring extensive human and material resources. This study proposes a two-stage computational framework, ACP-CapsPred, that identifies ACPs and characterizes their functional activities across different cancer types. ACP-CapsPred integrates a protein language model with evolutionary information and physicochemical properties of peptides to construct a comprehensive peptide profile, and it uses capsule networks, a next-generation neural network architecture, to build its predictive models. Experimental results demonstrate that ACP-CapsPred exhibits satisfactory predictive capabilities in both stages, reaching state-of-the-art performance. In the first stage, ACP-CapsPred achieves accuracies of 80.25% and 95.71% and F1-scores of 79.86% and 95.90% on the benchmark datasets Set 1 and Set 2, respectively. In the second stage, which characterizes the functional activities of ACPs across five selected cancer types, ACP-CapsPred attains an average accuracy of 90.75% and an F1-score of 91.38%. Furthermore, ACP-CapsPred demonstrates excellent interpretability, revealing regions and residues associated with anticancer activity. Consequently, ACP-CapsPred presents a promising solution to expedite the development of ACPs and offers a novel perspective for other biological sequence analyses.
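The abstract reports accuracy and F1-score for each stage. As a reminder of how the two metrics relate, the following is a minimal sketch for a binary ACP versus non-ACP task. It is not the authors' evaluation code: the names `Confusion`, `confusion_counts`, `accuracy`, and `f1_score` are hypothetical, and the labels in the usage example are illustrative rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for a binary classifier."""
    tp: int  # ACPs predicted as ACPs
    fp: int  # non-ACPs predicted as ACPs
    fn: int  # ACPs predicted as non-ACPs
    tn: int  # non-ACPs predicted as non-ACPs


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP/FP/FN/TN from two equal-length label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, fn, tn)


def accuracy(c):
    """Fraction of all predictions that are correct."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def f1_score(c):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# Illustrative labels only (1 = ACP, 0 = non-ACP); not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(accuracy(counts), f1_score(counts))
```

Accuracy counts true negatives while F1 does not, which is why the two figures in the abstract can differ on the same dataset.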