Zero shot molecular generation via similarity kernels.

Authors

Elijošius Rokas, Zills Fabian, Batatia Ilyes, Norwood Sam Walton, Kovács Dávid Péter, Holm Christian, Csányi Gábor

Affiliations

Engineering Laboratory, University of Cambridge, Cambridge, UK.

Institute for Computational Physics, University of Stuttgart, Stuttgart, Germany.

Publication information

Nat Commun. 2025 Jul 1;16(1):5991. doi: 10.1038/s41467-025-60963-3.

Abstract

Generative modelling aims to accelerate the discovery of novel chemicals by directly proposing structures with desirable properties. Recently, score-based, or diffusion, generative models have significantly outperformed previous approaches. Key to their success is the close relationship between the score and physical force, allowing the use of powerful equivariant neural networks. However, the behaviour of the learnt score is not yet well understood. Here, we analyse the score by training an energy-based diffusion model for molecular generation. We find that during the generation the score resembles a restorative potential initially and a quantum-mechanical force at the end, exhibiting special properties in between that enable the building of large molecules. Building upon these insights, we present Similarity-based Molecular Generation (SiMGen), a new zero-shot molecular generation method. SiMGen combines a time-dependent similarity kernel with local many-body descriptors to generate molecules without any further training. Our approach allows shape control via point cloud priors. Importantly, it can also act as guidance for existing trained models, enabling fragment-biased generation. We also release an interactive web tool, ZnDraw, for online SiMGen generation (https://zndraw.icp.uni-stuttgart.de).
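The idea of a time-dependent similarity kernel acting as a force can be illustrated with a toy sketch. This is not the authors' implementation: raw coordinates stand in for the local many-body descriptors, the kernel is assumed to be Gaussian, the energy is assumed to be the negative log of the summed kernel, and `kernel_force`, `sigma_schedule`, and `generate` are hypothetical names with arbitrary parameter values. With a wide kernel the force pulls a point toward the bulk of the reference set, loosely like a restorative potential; as the kernel narrows, the nearest reference dominates.

```python
import math


def kernel_force(x, refs, sigma):
    """Force -dE/dx for E(x) = -log sum_i exp(-|x - r_i|^2 / (2 sigma^2)).

    Assumption: a Gaussian kernel on raw coordinates, used here only as a
    stand-in for a similarity kernel on many-body descriptors.
    """
    # Squared distances from x to every reference point.
    d2 = [sum((xi - ri) ** 2 for xi, ri in zip(x, r)) for r in refs]
    # Unnormalised kernel weights, shifted by the minimum for stability.
    d2_min = min(d2)
    w = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    total = sum(w)
    # Weighted pull toward the references, scaled by 1 / sigma^2.
    return [
        sum(wi * (r[k] - x[k]) for wi, r in zip(w, refs)) / (total * sigma ** 2)
        for k in range(len(x))
    ]


def sigma_schedule(t, sigma_max=5.0, sigma_min=0.1):
    """Hypothetical schedule: wide kernel at t = 1 (start), narrow at t = 0."""
    return sigma_min * (sigma_max / sigma_min) ** t


def generate(x0, refs, steps=200, step_size=0.5):
    """Noise-free descent from x0 while the kernel narrows (toy example)."""
    x = list(x0)
    for n in range(steps):
        t = 1.0 - n / (steps - 1)
        sigma = sigma_schedule(t)
        force = kernel_force(x, refs, sigma)
        # Scale the step by sigma^2 so updates stay bounded as sigma shrinks.
        x = [xi + step_size * sigma ** 2 * fi for xi, fi in zip(x, force)]
    return x


if __name__ == "__main__":
    references = [(0.0, 0.0), (4.0, 0.0)]
    print(generate((0.5, 1.0), references))
```

Because no training is involved, changing the reference set changes what the sketch is drawn toward, which is the sense in which such a scheme is zero-shot. A real generator would also add noise and act on atomic environments rather than single points.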

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e4a8/12216838/8daa1e378751/41467_2025_60963_Fig1_HTML.jpg
