Enhancing Enzyme Commission Number Prediction With Contrastive Learning and Agent Attention

Zhao Wendi, Han Qiaoling, Yang Fan, Zhao Yue

School of Technology, Beijing Forestry University, Beijing, China.

Key Lab of State Forestry Administration for Forestry Equipment and Automation, Beijing, China.

Proteins. 2025 Sep;93(9):1507-1517. doi: 10.1002/prot.26822. Epub 2025 Apr 2.

The accurate prediction of enzyme function is crucial for elucidating disease mechanisms and identifying drug targets. Nevertheless, existing enzyme commission (EC) number prediction methods are limited by database coverage and the depth of sequence information mining, hindering the efficiency and precision of enzyme function annotation. Therefore, this study introduces ProteEC-CLA (Protein EC number prediction model with Contrastive Learning and Agent Attention). ProteEC-CLA utilizes contrastive learning to construct positive and negative sample pairs, which not only enhances sequence feature extraction but also improves the utilization of unlabeled data. This process helps the model learn the differences in sequence features, thereby enhancing its ability to predict enzyme function. Integrating the pre-trained protein language model ESM2, the model generates informative sequence embeddings for deep functional correlation analysis, significantly enhancing prediction accuracy. With the incorporation of the Agent Attention mechanism, ProteEC-CLA's ability to comprehensively capture local details and global features is enhanced, ensuring high-accuracy predictions on complex sequences. The results demonstrate that ProteEC-CLA performs exceptionally well on two independent and representative datasets. In the standard dataset, it achieves 98.92% accuracy at the EC4 level. In the more challenging clustered split dataset, ProteEC-CLA achieves 93.34% accuracy and an F1-score of 94.72%. With only enzyme sequences as input, ProteEC-CLA can accurately predict EC numbers up to the fourth level, significantly enhancing annotation efficiency and accuracy, which makes it a highly efficient and precise functional annotation tool for enzymology research and applications.
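The abstract does not spell out how the positive and negative pairs are scored, so the following is only a minimal sketch of one common formulation, an InfoNCE-style loss over sequence embeddings. It assumes that a positive shares the anchor's full EC number and a negative does not; the function names (`cosine`, `info_nce`, `build_pairs`), the temperature value, and the two-dimensional toy vectors standing in for ESM2 embeddings are all hypothetical and are not taken from ProteEC-CLA.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor.

    Returns -log(softmax probability assigned to the positive) among
    the positive and all negatives. The temperature is an assumed
    hyperparameter, not a value reported in the paper.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


def build_pairs(embeddings, labels, anchor_idx):
    """Split the other samples into positives (same EC label as the
    anchor) and negatives (different EC label). Assumed pairing rule."""
    positives, negatives = [], []
    for i, (emb, label) in enumerate(zip(embeddings, labels)):
        if i == anchor_idx:
            continue
        if label == labels[anchor_idx]:
            positives.append(emb)
        else:
            negatives.append(emb)
    return positives, negatives


# Toy 2-D vectors standing in for sequence embeddings (illustrative only).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.2]]
labels = ["1.1.1.1", "1.1.1.1", "2.7.11.1", "3.4.21.4"]

positives, negatives = build_pairs(embeddings, labels, anchor_idx=0)

# Loss when the true same-EC sample is treated as the positive.
loss_good = info_nce(embeddings[0], positives[0], negatives)
# Loss when a different-EC sample is (wrongly) treated as the positive.
loss_bad = info_nce(embeddings[0], negatives[0], [positives[0], negatives[1]])
```

In this toy setup the loss is small when the same-EC embedding lies close to the anchor and large when a different-EC embedding is treated as the positive, which is the kind of signal that pushes a model to separate sequence features by function.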


Similar articles

Enhancing Enzyme Commission Number Prediction With Contrastive Learning and Agent Attention.

Proteins. 2025 Sep;93(9):1507-1517. doi: 10.1002/prot.26822. Epub 2025 Apr 2.

Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.

Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.

GOBeacon: An ensemble model for protein function prediction enhanced by contrastive learning.

Protein Sci. 2025 Jul;34(7):e70182. doi: 10.1002/pro.70182.

Short-Term Memory Impairment

iACP-DPNet: a dual-pooling causal dilated convolutional network for interpretable anticancer peptide identification.

Funct Integr Genomics. 2025 Jul 4;25(1):147. doi: 10.1007/s10142-025-01641-x.

Unveiling the evolution of policies for enhancing protein structure predictions: A comprehensive analysis.

Comput Biol Med. 2024 Sep;179:108815. doi: 10.1016/j.compbiomed.2024.108815. Epub 2024 Jul 11.

Advancing the accuracy of clathrin protein prediction through multi-source protein language models.

Sci Rep. 2025 Jul 8;15(1):24403. doi: 10.1038/s41598-025-08510-4.

Are Current Survival Prediction Tools Useful When Treating Subsequent Skeletal-related Events From Bone Metastases?

Clin Orthop Relat Res. 2024 Sep 1;482(9):1710-1721. doi: 10.1097/CORR.0000000000003030. Epub 2024 Mar 22.

Does the Presence of Missing Data Affect the Performance of the SORG Machine-learning Algorithm for Patients With Spinal Metastasis? Development of an Internet Application Algorithm.

Clin Orthop Relat Res. 2024 Jan 1;482(1):143-157. doi: 10.1097/CORR.0000000000002706. Epub 2023 Jun 12.

The Black Book of Psychotropic Dosing and Monitoring.

Psychopharmacol Bull. 2024 Jul 8;54(3):8-59.
