Guidelines for Recurrent Neural Network Transfer Learning-Based Molecular Generation of Focused Libraries.

Affiliations

School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom.

Computational Sciences, GlaxoSmithKline, Gunnels Wood Road, Stevenage, Herts SG1 2NY, United Kingdom.

Publication information

J Chem Inf Model. 2020 Dec 28;60(12):5699-5713. doi: 10.1021/acs.jcim.0c00343. Epub 2020 Jul 24.

DOI: 10.1021/acs.jcim.0c00343
PMID: 32659085

Abstract

Deep learning approaches have become popular in recent years in the field of molecular design. While a variety of different methods are available, it is still a challenge to assess and compare their performance. A particularly promising approach for automated drug design is to use recurrent neural networks (RNNs) as SMILES generators and train them with the learning procedure called "transfer learning". This involves first training the initial model on a large generic data set of molecules to learn the general syntax of SMILES, followed by fine-tuning on a smaller set of molecules, coming from, e.g., a lead optimization program. To create a well-performing transfer learning application which can be automated, it is important to understand how the size of the second data set affects the training process. In addition, extensive postfiltering using similarity metrics of the molecules generated after transfer learning should be avoided, as it can introduce new biases toward the selection of drug candidates. Here, we present results from the application of a gated recurrent unit cell (GRU)-RNN to transfer learning on data sets of varying sizes and complexity. Analysis of the results has allowed us to provide some general guidelines for transfer learning. In particular, we show that data set sizes containing at least 190 molecules are needed for effective GRU-RNN-based molecular generation using transfer learning. The methods presented here should be applicable generally to the benchmarking of other deep learning methodologies for molecule generation.
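
The two-stage procedure described in the abstract (train on a large generic set, then fine-tune on a small focused set) can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: it swaps the GRU-RNN for a character-level bigram model so that it runs with the Python standard library alone. The class `BigramSmilesModel`, the helper `meets_size_guideline`, the `weight` parameter, and the toy SMILES lists are all hypothetical names and data introduced here. Only the 190-molecule threshold comes from the abstract.

```python
import random
from collections import defaultdict

# Hypothetical sequence markers; neither is a SMILES character.
START, END = "^", "\n"

# Guideline from the abstract: at least 190 molecules for fine-tuning.
MIN_FINE_TUNE_SIZE = 190


def meets_size_guideline(smiles_list, minimum=MIN_FINE_TUNE_SIZE):
    """Return True if the fine-tuning set reaches the suggested size."""
    return len(smiles_list) >= minimum


class BigramSmilesModel:
    """Character-level bigram model: a toy stand-in for the GRU-RNN.

    Simplification: splitting per character breaks two-letter atoms
    such as Cl and Br, which a real tokenizer would keep together.
    """

    def __init__(self):
        # counts[prev][nxt] = weighted number of times nxt followed prev
        self.counts = defaultdict(lambda: defaultdict(float))

    def train(self, smiles_list, weight=1.0):
        """Add transition counts; a second call on new data acts as fine-tuning."""
        for smi in smiles_list:
            seq = START + smi + END
            for prev, nxt in zip(seq, seq[1:]):
                self.counts[prev][nxt] += weight

    def prob(self, prev, nxt):
        """Estimated probability that nxt follows prev."""
        options = self.counts.get(prev, {})
        total = sum(options.values())
        return options.get(nxt, 0.0) / total if total else 0.0

    def sample(self, rng, max_len=100):
        """Generate one string by sampling characters until END or max_len."""
        out, prev = [], START
        for _ in range(max_len):
            options = self.counts.get(prev)
            if not options:
                break
            chars = list(options)
            nxt = rng.choices(chars, weights=[options[c] for c in chars])[0]
            if nxt == END:
                break
            out.append(nxt)
            prev = nxt
        return "".join(out)


# Toy data (assumed): real sets would be far larger, and the focused set
# here is well below the 190-molecule guideline; it only shows the mechanics.
generic = ["CCO", "CCN", "CCC", "CCCl", "c1ccccc1", "CC(=O)O", "CCOC"]
focused = ["c1ccccc1O", "c1ccccc1N", "c1ccccc1C"]

model = BigramSmilesModel()
model.train(generic)                  # stage 1: learn general SMILES syntax
before = model.prob("1", "O")         # ring closure followed by O, before fine-tuning

model.train(focused, weight=5.0)      # stage 2: fine-tune on the focused set
after = model.prob("1", "O")          # the same transition after fine-tuning

rng = random.Random(0)
samples = [model.sample(rng) for _ in range(5)]
print(before, after, meets_size_guideline(focused))
print(samples)
```

Because fine-tuning reuses the same `train` call on a second data set, the sketch mirrors the structure of transfer learning: the focused set shifts the transition probabilities learned from the generic set rather than replacing them. A real application would replace the bigram counts with GRU weights, the toy lists with real data sets, and would check the chemical validity of each sampled string.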
