

Virtual data augmentation method for reaction prediction.

Affiliations

Artificial Intelligence Aided Drug Discovery Institute, College of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou, 310014, People's Republic of China.

College of Mathematics and Physics, Shanghai University of Electric Power, Shanghai, 201203, People's Republic of China.

Publication information

Sci Rep. 2022 Oct 12;12(1):17098. doi: 10.1038/s41598-022-21524-6.

DOI:10.1038/s41598-022-21524-6
PMID:36224331
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9556613/
Abstract

To improve the performance of data-driven reaction prediction models, we propose a strategy that predicts reaction products from the available data while increasing the sample size through fake data augmentation. In this research, fake datasets were created and combined with the raw data to train the models. The fake reaction datasets were created by replacing some functional groups; that is, the fake data are compounds with modified functional groups, which increases the amount of data available for reaction prediction. This approach was tested on five different reactions, and the results show improved model predictivity over other relevant techniques. Furthermore, we evaluated the method with different models, confirming the generality of virtual data augmentation. In summary, virtual data augmentation can be used as an effective measure to address insufficient data and significantly improve the performance of reaction prediction.
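
The functional-group replacement described in the abstract can be sketched as follows. This is only an illustration, assuming reactions are stored as reaction SMILES strings and that the swapped group (here a halogen) does not take part in the reaction. The swap table, the function name `augment_reaction`, and the example reaction are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical swap table: each key is a group found in the raw data,
# each value lists the groups used to build fake variants of it.
HALOGEN_SWAPS: Dict[str, List[str]] = {"Br": ["Cl", "I"]}


def augment_reaction(rxn: str, swaps: Dict[str, List[str]] = HALOGEN_SWAPS) -> List[str]:
    """Return fake reaction SMILES built by swapping a group on both sides.

    Assumes `rxn` has the form "reactants>>products" and that the swapped
    group is a spectator, so the same replacement is valid in the products.
    Plain string replacement is used here; a real pipeline would more likely
    use a cheminformatics toolkit to match groups and validate the output.
    """
    reactants, products = rxn.split(">>")
    fakes: List[str] = []
    for old, replacements in swaps.items():
        if old not in reactants:
            continue  # nothing to replace in this reaction
        for new in replacements:
            fakes.append(f"{reactants.replace(old, new)}>>{products.replace(old, new)}")
    return fakes


# Raw example: esterification of 4-bromobenzoic acid with ethanol.
raw = "OC(=O)c1ccc(Br)cc1.OCC>>CCOC(=O)c1ccc(Br)cc1"

# The augmented training set is the raw reaction plus its fake variants.
training_set = [raw] + augment_reaction(raw)
for reaction in training_set:
    print(reaction)
```

Each raw reaction that contains the group yields one fake reaction per replacement, so the training set grows without any new experimental data.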
