ACP-BC: A Model for Accurate Identification of Anticancer Peptides Based on Fusion Features of Bidirectional Long Short-Term Memory and Chemically Derived Information.

Affiliations

Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China.

School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK.

Publication information

Int J Mol Sci. 2023 Oct 22;24(20):15447. doi: 10.3390/ijms242015447.

Abstract

Anticancer peptides (ACPs) have been proven to possess potent anticancer activities. Although computational methods have emerged for rapid ACP identification, their accuracy still needs improvement. In this study, we propose ACP-BC, a three-channel end-to-end model that utilizes various combinations of data augmentation techniques. In the first channel, features are extracted from the raw sequence using a bidirectional long short-term memory (BiLSTM) network. In the second channel, the entire sequence is converted into a chemical molecular formula and written in Simplified Molecular Input Line Entry System (SMILES) notation, from which a Bidirectional Encoder Representations from Transformers (BERT) model extracts deep abstract features. In the third channel, we manually selected four effective features: dipeptide composition, binary profile feature, k-mer sparse matrix, and pseudo amino acid composition. Notably, the application of chemical BERT to ACP prediction is novel and was successfully integrated into our model. To validate the performance of our model, we selected two benchmark datasets, ACPs740 and ACPs240. ACP-BC achieved prediction accuracies of 87% and 90% on these two datasets, respectively, representing improvements of 1.3% and 7% over existing state-of-the-art methods on these datasets. Systematic comparative experiments therefore show that ACP-BC can effectively identify anticancer peptides.
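To make the third channel more concrete, the following is a minimal sketch of two of the four handcrafted features named in the abstract, dipeptide composition and the binary profile feature. It is not the authors' implementation: the function names, the 20-letter amino acid alphabet ordering, the window length of 7 residues, and the choice to normalize by the number of valid dipeptides are all assumptions made for illustration.

```python
from itertools import product

# Assumed ordering of the 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order that defines the vector layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Assumption: pairs containing non-standard residues are skipped, and
    frequencies are normalized by the number of valid pairs.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def binary_profile(seq, length=7):
    """One-hot encode the first `length` residues (20 values per position).

    Assumption: shorter sequences are zero-padded, and non-standard
    residues become all-zero positions.
    """
    seq = seq.upper()
    vec = []
    for i in range(length):
        residue = seq[i] if i < len(seq) else None
        vec.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vec


if __name__ == "__main__":
    peptide = "FLPLLAGLAANFLPKIF"  # hypothetical example sequence
    dpc = dipeptide_composition(peptide)
    bpf = binary_profile(peptide)
    print(len(dpc), len(bpf))
```

In a setup like the one described, vectors of this kind would be concatenated with the k-mer and pseudo amino acid composition features before being fused with the BiLSTM and BERT channels; how the fusion is performed is not specified in the abstract.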

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a931/10607064/8244e4af9d83/ijms-24-15447-g001.jpg
