Morehead Alex, Cheng Jianlin
Department of Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO, 65211, USA.
Commun Chem. 2024 Jul 3;7(1):150. doi: 10.1038/s42004-024-01233-z.
Generative deep learning methods have recently been proposed for generating 3D molecules using equivariant graph neural networks (GNNs) within a denoising diffusion framework. However, such methods are unable to learn important geometric properties of 3D molecules, as they adopt molecule-agnostic and non-geometric GNNs as their 3D graph denoising networks, which notably hinders their ability to generate valid large 3D molecules. In this work, we address these gaps by introducing the Geometry-Complete Diffusion Model (GCDM) for 3D molecule generation, which outperforms existing 3D molecular diffusion models by significant margins across conditional and unconditional settings for the QM9 dataset and the larger GEOM-Drugs dataset, respectively. Importantly, we demonstrate that GCDM's generative denoising process enables the model to generate a significant proportion of valid and energetically stable large molecules at the scale of GEOM-Drugs, whereas previous methods fail to do so with the features they learn. Additionally, we show that extensions of GCDM can not only effectively design 3D molecules for specific protein pockets but can also be repurposed to consistently optimize the geometry and chemical composition of existing 3D molecules for molecular stability and property specificity, demonstrating new versatility of molecular diffusion models. Code and data are freely available on GitHub.