CLAIRE: a contrastive learning-based predictor for EC number of chemical reactions.

Authors

Zeng Zishuo, Guo Jin, Jin Jiao, Luo Xiaozhou

Affiliations

Synceres Biosciences Co. Ltd., Shenzhen, 518100, China.

Shenzhen Key Laboratory for the Intelligent Microbial Manufacturing of Medicines, Key Laboratory of Quantitative Synthetic Biology, Center for Synthetic Biochemistry, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.

Publication information

J Cheminform. 2025 Jan 7;17(1):2. doi: 10.1186/s13321-024-00944-8.

DOI: 10.1186/s13321-024-00944-8
PMID: 39773344
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11707929/

Abstract

Predicting Enzyme Commission (EC) numbers for chemical reactions enables efficient enzymatic annotation in computer-aided synthesis planning. However, conventional machine learning approaches struggle with data scarcity and class imbalance. Here, we introduce CLAIRE (Contrastive Learning-based AnnotatIon for Reaction's EC), a framework that combines contrastive learning, reaction embeddings from a pre-trained language model, and data augmentation to address these limitations. CLAIRE achieved weighted average F1 scores of 0.861 on the testing set (n = 18,816) and 0.911 on an independent dataset (n = 1040) derived from a yeast metabolic model. It outperformed the state-of-the-art model by 3.65-fold and 1.18-fold on these two datasets, respectively. This accuracy positions CLAIRE as a promising tool for retrosynthesis planning, drug fate prediction, and synthetic biology applications. CLAIRE is freely available on GitHub (https://github.com/zishuozeng/CLAIRE).

Scientific contribution

This work applies contrastive learning to predicting the EC numbers of enzymatic reactions, addressing the challenges of data scarcity and class imbalance. The new model achieves state-of-the-art performance and may facilitate computer-aided synthesis planning.
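
The abstract reports weighted average F1 scores. For readers unfamiliar with the metric, the following is a minimal sketch of how a support-weighted F1 is commonly computed: each class's F1 is weighted by its share of the true labels. The function name `weighted_f1` and the toy labels are illustrative assumptions, not the authors' evaluation code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Hypothetical helper for illustration only; it is not taken
    from the CLAIRE repository.
    """
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is > 0 because n >= 1
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (n / total) * f1
    return score


# Toy, made-up EC labels: one frequent class and one rare class
y_true = ["1.1.1.1", "1.1.1.1", "1.1.1.1", "2.7.1.1"]
y_pred = ["1.1.1.1", "1.1.1.1", "2.7.1.1", "2.7.1.1"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Because the weights follow class frequency, frequent classes dominate the score, which is worth keeping in mind when reading a single weighted F1 value for an imbalanced label set.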

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71fc/11707929/0679947d5a7c/13321_2024_944_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71fc/11707929/939e7a269438/13321_2024_944_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71fc/11707929/e84985d19616/13321_2024_944_Fig3_HTML.jpg
