De Novo Molecular Design by Combining Deep Autoencoder Recurrent Neural Networks with Generative Topographic Mapping.

Affiliations

Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France.

Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 19991, Russia.

Publication information

J Chem Inf Model. 2019 Mar 25;59(3):1182-1196. doi: 10.1021/acs.jcim.8b00751. Epub 2019 Mar 5.

Abstract

Here we show that Generative Topographic Mapping (GTM) can be used to explore the latent space of SMILES-based autoencoders and to generate focused molecular libraries of interest. We built a sequence-to-sequence neural network with Bidirectional Long Short-Term Memory layers and trained it on SMILES strings from ChEMBL23. Very high reconstruction rates were achieved for the test set molecules (>98%), comparable to those reported in related publications. Using GTM, we visualized the autoencoder latent space on a two-dimensional topographic map. Targeted map zones can be used to generate novel molecular structures by sampling the associated latent space points and decoding them to SMILES. A sampling method based on a genetic algorithm was introduced to optimize compound properties "on the fly". The generated focused molecular libraries were shown to contain original and a priori feasible compounds which, pending actual synthesis and testing, showed encouraging behavior in independent structure-based affinity estimation procedures (pharmacophore matching, docking).
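
The idea of sampling latent space points with a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (evolve_latent, toy_score), the population settings, and the scoring function are all hypothetical. In the described workflow the score would presumably come from decoding each latent vector to SMILES and evaluating a property or a map zone; here a toy distance-to-target score stands in for that step, and no autoencoder or GTM is involved.

```python
import random

def evolve_latent(score, dim, pop_size=40, generations=30,
                  sigma=0.1, elite_frac=0.25, seed=0):
    """Toy elitist genetic algorithm over real-valued latent vectors.

    `score` maps a latent vector (list of floats) to a number to maximize.
    All parameters here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Random initial population of latent vectors.
    pop = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    n_elite = max(2, int(pop_size * elite_frac))
    history = []  # best score seen at the start of each generation

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        elite = pop[:n_elite]  # survivors are carried over unchanged
        children = []
        while len(children) < pop_size - n_elite:
            a, b = rng.sample(elite, 2)
            # Uniform crossover followed by Gaussian mutation.
            child = [(x if rng.random() < 0.5 else y) + rng.gauss(0.0, sigma)
                     for x, y in zip(a, b)]
            children.append(child)
        pop = elite + children

    return max(pop, key=score), history

def toy_score(z):
    # Hypothetical stand-in for "decode to SMILES, then score the molecule":
    # negative squared distance to an arbitrary target point in latent space.
    return -sum((zi - 0.5) ** 2 for zi in z)

best, history = evolve_latent(toy_score, dim=8)
```

Because the elite vectors survive each generation unchanged, the best score recorded in history never decreases. In a real setting the expensive part would be the scoring call, since every candidate has to be decoded and evaluated.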

