Automated Generation of Novel Fragments Using Screening Data, a Dual SMILES Autoencoder, Transfer Learning and Syntax Correction.

Affiliations

Beatson Drug Discovery Unit, Cancer Research UK Beatson Institute, Garscube Estate, Switchback Road, Bearsden, Glasgow, G61 1BD, U.K.

BioAscent Discovery Ltd., Bo'Ness Road, Newhouse, Lanarkshire ML1 5UH, U.K.

Publication information

J Chem Inf Model. 2021 Jun 28;61(6):2547-2559. doi: 10.1021/acs.jcim.0c01226. Epub 2021 May 24.

DOI: 10.1021/acs.jcim.0c01226
PMID: 34029470

Abstract

Fragment-based hit identification (FBHI) allows proportionately greater coverage of chemical space using fewer molecules than traditional high-throughput screening approaches. However, effectively exploiting this advantage is highly dependent on the library design. Solubility, stability, chemical complexity, chemical/shape diversity, and synthetic tractability for fragment elaboration are all critical aspects, and molecule design remains a time-consuming task for computational and medicinal chemists. Artificial neural networks have attracted considerable attention in automated design applications and could also prove useful for fragment library design. Chemical autoencoders are neural networks consisting of encoder and decoder parts, which respectively compress and decompress molecular representations. The decoder is applied to samples drawn from the space of compressed representations to generate novel molecules that can be scored for properties of interest. Here, we report an autoencoder model using a recurrent neural network architecture, which was trained using 486,565 fragments curated from commercial sources, to simultaneously reconstruct both SMILES and chemical fingerprints. To explore its utility in fragment design, we applied transfer learning to the fingerprint decoder layers to train a classifier using 66 frequent hitter fragments identified from our screening campaigns. Using a particle swarm optimization sampling approach, we compare the performance of this "dual" model to an architecture encoding SMILES only. The dual model produced valid SMILES with improved features, considering a range of properties including aromatic ring counts, heavy atom count, synthetic accessibility, and a new fragment complexity score we term Feature Complexity (FeCo). 
Additionally, we demonstrate that generative performance is further enhanced by use of a simple syntax-correction procedure during training, in which invalid and undesirable SMILES are spiked into the training set. Finally, we used the syntax-corrected model to generate a library of novel candidate privileged fragments.
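
The particle swarm optimization sampling mentioned in the abstract can be illustrated with a minimal, generic sketch. This is not the authors' implementation: the two-dimensional search space stands in for the autoencoder's compressed representation, and `toy_score` is a hypothetical placeholder for decoding a sampled point to a molecule and scoring its properties. All function names, parameter names, and coefficient values here are assumptions chosen for illustration.

```python
import random


def toy_score(z):
    """Hypothetical stand-in for the score of a molecule decoded from point z.

    Higher is better; the maximum (0.0) is at the arbitrary target (1.0, -2.0).
    """
    target = (1.0, -2.0)
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, target))


def particle_swarm(score, dim=2, n_particles=20, n_steps=100,
                   inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Maximize `score` over a continuous space with a basic particle swarm."""
    rng = random.Random(seed)
    # Random starting positions, zero starting velocities.
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best point; the swarm shares one best.
    best_pos = [p[:] for p in pos]
    best_val = [score(p) for p in pos]
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull toward the particle's own
                # best, and pull toward the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = score(pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


best_z, best_score = particle_swarm(toy_score)
```

In a generative setting, the score function would decode each particle position to a SMILES string and evaluate it, so the swarm drifts toward regions of the compressed space that decode to molecules with the desired properties.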
