GEN: highly efficient SMILES explorer using autodidactic generative examination networks.

Authors

van Deursen Ruud, Ertl Peter, Tetko Igor V, Godin Guillaume

Affiliations

Firmenich SA, Research and Development, Rue des Jeunes 1, Les Acacias, 1227, Geneva, Switzerland.

Novartis Institutes for BioMedical Research, Novartis Campus, 4056, Basel, Switzerland.

Publication information

J Cheminform. 2020 Apr 10;12(1):22. doi: 10.1186/s13321-020-00425-8.

DOI: 10.1186/s13321-020-00425-8
PMID: 33430998
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7146994/

Abstract

Recurrent neural networks have been widely used to generate millions of de novo molecules in defined chemical spaces. Reported deep generative models are exclusively based on LSTM and/or GRU units and frequently trained using canonical SMILES. In this study, we introduce Generative Examination Networks (GEN) as a new approach to train deep generative networks for SMILES generation. In our GENs, we have used an architecture based on multiple concatenated bidirectional RNN units to enhance the validity of generated SMILES. GENs autonomously learn the target space in a few epochs and are stopped early using an independent online examination mechanism, measuring the quality of the generated set. Herein we have used online statistical quality control (SQC) on the percentage of valid molecular SMILES as examination measure to select the earliest available stable model weights. Very high levels of valid SMILES (95-98%) can be generated using multiple parallel encoding layers in combination with SMILES augmentation using unrestricted SMILES randomization. Our trained models combine an excellent novelty rate (85-90%) while generating SMILES with strong conservation of the property space (95-99%). In GENs, both the generative network and the examination mechanism are open to other architectures and quality criteria.
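
The examination step described in the abstract, which picks the earliest stable model weights from the per-epoch percentage of valid SMILES, can be illustrated with a minimal sketch. The abstract does not give the SQC rule, so the stability rule here is an assumption: a rolling window whose mean validity clears a threshold and whose standard deviation stays small. The function name `earliest_stable_epoch` and the parameters `window`, `min_mean`, and `max_std` are hypothetical and not taken from the paper. The validity percentages are assumed to come from an external SMILES parser run on each epoch's generated sample.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence


def earliest_stable_epoch(
    validity: Sequence[float],
    window: int = 3,        # assumed number of consecutive epochs to examine
    min_mean: float = 95.0,  # assumed minimum mean validity (%) in the window
    max_std: float = 1.0,    # assumed maximum spread (%) to call the window stable
) -> Optional[int]:
    """Return the 0-based index of the epoch that closes the first stable
    window, or None if no window satisfies both criteria."""
    for end in range(window, len(validity) + 1):
        recent = validity[end - window:end]
        if mean(recent) >= min_mean and pstdev(recent) <= max_std:
            return end - 1
    return None


# Illustrative (made-up) validity percentages, one value per training epoch.
history = [42.0, 71.5, 88.0, 94.0, 96.5, 96.8, 97.1, 96.9]
stop_at = earliest_stable_epoch(history)
print(stop_at)  # index of the epoch whose weights would be kept
```

In this sketch, training would stop at the returned epoch and that epoch's weights would be kept. Because the abstract leaves the examination mechanism open to other quality criteria, the same loop could score novelty or property-space conservation instead of validity.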

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2108/7146994/7cf063c24c0b/13321_2020_425_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2108/7146994/036b7b334ff8/13321_2020_425_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2108/7146994/4e116ed94b4e/13321_2020_425_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2108/7146994/e4a4ef0bb0e2/13321_2020_425_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2108/7146994/5b9ba25bedbe/13321_2020_425_Fig5_HTML.jpg
Fig. 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2108/7146994/d6ffe85a8599/13321_2020_425_Fig6_HTML.jpg
