Fragment-based deep molecular generation using hierarchical chemical graph representation and multi-resolution graph variational autoencoder.

Author information

Department of Computer Science and Engineering, University of Connecticut, Storrs, 06269, CT.

Current address: Center for Artificial Intelligence in Drug Discovery, School of Medicine, Case Western Reserve University, Cleveland, 44106, OH.

Publication information

Mol Inform. 2023 May;42(5):e2200215. doi: 10.1002/minf.202200215. Epub 2023 Mar 17.

Abstract

Graph generative models have recently emerged as an interesting approach for constructing molecular structures atom-by-atom or fragment-by-fragment. In this study, we adopt the fragment-based strategy and decompose each input molecule into a set of small chemical fragments. In drug discovery, a few drug molecules are designed by replacing certain chemical substituents with their bioisosteres or alternative chemical moieties. This inspires us to group the decomposed fragments into fragment clusters according to the local structural environment around their bond-breaking positions. In this way, an input structure can be transformed into an equivalent three-layer graph, in which individual atoms, decomposed fragments, and the resulting fragment clusters act as the graph nodes of their respective layers. We further implement a prototype model, named multi-resolution graph variational autoencoder (MRGVAE), to learn embeddings of the constituent nodes at each layer in a fine-to-coarse order. Our decoder adopts a similar hierarchy in the reverse, coarse-to-fine order. It first predicts the next possible fragment cluster, then samples an exact fragment structure from that cluster, and finally attaches it to the preceding chemical moiety, repeating these steps sequentially. Our proposed approach performs comparatively well on molecular evaluation metrics against several other graph-based molecular generative models. We expect the additional fragment cluster graph layer to increase the odds of assembling new chemical moieties absent from the original training set and to enhance their structural diversity. We hope that our prototyping work will inspire more creative research into incorporating different kinds of chemical domain knowledge into similar multi-resolution neural network architectures.
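
The coarse-to-fine decoding order described in the abstract (choose a fragment cluster first, then a concrete fragment inside it) can be illustrated with a small toy sketch. This is not the authors' MRGVAE implementation: the fragment names, the environment labels, the fixed cluster weights, and the functions `build_clusters`, `decode_step`, and `generate` are all hypothetical stand-ins. In the actual model the cluster choice would come from a learned decoder rather than fixed weights, and attachment would operate on chemical graphs rather than on a list.

```python
import random
from collections import defaultdict

# Hypothetical toy data: (fragment name, attachment environment).
# The environment label stands in for the local structure around the
# bond-breaking position; a real model would derive it from the graph.
FRAGMENTS = [
    ("phenyl", "aromatic_C"),
    ("pyridyl", "aromatic_C"),
    ("thienyl", "aromatic_C"),
    ("carboxyl", "carbonyl_C"),
    ("amide", "carbonyl_C"),
    ("methyl", "aliphatic_C"),
]


def build_clusters(fragments):
    """Group fragments that share the same attachment environment."""
    clusters = defaultdict(list)
    for name, env in fragments:
        clusters[env].append(name)
    return dict(clusters)


def decode_step(clusters, cluster_weights, rng):
    """One coarse-to-fine step: pick a cluster, then a fragment inside it."""
    names = sorted(clusters)
    weights = [cluster_weights[n] for n in names]
    # Coarse choice: which fragment cluster comes next (assumed fixed weights).
    cluster = rng.choices(names, weights=weights, k=1)[0]
    # Fine choice: which concrete fragment of that cluster to use.
    fragment = rng.choice(clusters[cluster])
    return cluster, fragment


def generate(clusters, cluster_weights, n_steps, seed=0):
    """Append n_steps fragments in sequence; a list stands in for attachment."""
    rng = random.Random(seed)
    molecule = []
    for _ in range(n_steps):
        molecule.append(decode_step(clusters, cluster_weights, rng))
    return molecule


if __name__ == "__main__":
    clusters = build_clusters(FRAGMENTS)
    weights = {"aromatic_C": 2.0, "carbonyl_C": 1.0, "aliphatic_C": 1.0}
    for cluster, fragment in generate(clusters, weights, n_steps=4):
        print(f"{cluster:12s} -> {fragment}")
```

Because a fragment is always drawn from the chosen cluster, swapping one cluster member for another mirrors the bioisostere-replacement idea that motivates the clustering: fragments in one cluster are treated as interchangeable at the same kind of attachment site.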
