Diffusing on Two Levels and Optimizing for Multiple Properties: A Novel Approach to Generating Molecules With Desirable Properties.

Author information

Guo Siyuan, Guan Jihong, Zhou Shuigeng

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2024 Nov-Dec;21(6):2050-2063. doi: 10.1109/TCBB.2024.3434461. Epub 2024 Dec 10.

Abstract

In the past decade, Artificial Intelligence (AI)-driven drug design and discovery has been a hot research topic in the AI area. An important branch is molecule generation by generative models, from GAN-based and VAE-based models to the latest diffusion-based models. However, most existing models mainly pursue basic properties of the generated molecules, such as validity and uniqueness, and only a few go further to explicitly optimize a single important molecular property (e.g., QED or PlogP), so most generated molecules are of little practical use. In this paper, we present a novel approach to generating molecules with desirable properties, which extends the diffusion model framework with multiple innovative designs. The novelty is two-fold. On the one hand, considering that molecular structures are complex and diverse, and that molecular properties are usually determined by certain substructures (e.g., pharmacophores), we propose to perform diffusion on two structural levels, molecules and molecular fragments, which yields a mixed Gaussian distribution for the reverse diffusion process. To obtain desirable molecular fragments, we develop a novel electronic-effect-based fragmentation method. On the other hand, we introduce two ways to explicitly optimize multiple molecular properties under the diffusion model framework. First, as potential drug molecules must be chemically valid, we optimize molecular validity with an energy-guidance function. Second, since potential drug molecules should be desirable in various properties, we employ a multi-objective mechanism to optimize multiple molecular properties simultaneously. Extensive experiments on two benchmark datasets, QM9 and ZINC250k, show that the molecules generated by our proposed method have better validity, uniqueness, novelty, Fréchet ChemNet Distance (FCD), QED, and PlogP than those generated by current SOTA models.
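To make the two ideas in the abstract more concrete, here is a minimal one-dimensional toy sketch in Python. It is not the authors' implementation, and every name in it is hypothetical. It assumes that (a) the "mixed Gaussian" reverse step can be imitated by picking either a molecule-level or a fragment-level mean with some mixing probability, (b) guidance shifts that mean along the gradient of an objective, in the style of classifier guidance, and (c) the multi-objective mechanism is a simple weighted sum. The abstract does not confirm any of these details.

```python
import random

def weighted_sum(objectives, weights):
    """Scalarize several property scores into one objective (assumed scheme)."""
    def combined(x):
        return sum(w * f(x) for f, w in zip(objectives, weights))
    return combined

def numeric_grad(f, x, eps=1e-5):
    """Central-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

def guided_mean(mu, x, sigma, objective, scale):
    """Shift a reverse-step mean along the objective gradient at x (assumed form)."""
    return mu + scale * sigma ** 2 * numeric_grad(objective, x)

def reverse_step(x, mol_mean, frag_mean, mix, sigma, objective, scale, rng):
    """Sample one step from a two-component Gaussian mixture with guided means."""
    # Pick the molecule-level component with probability `mix`, else the fragment-level one.
    mean_fn = mol_mean if rng.random() < mix else frag_mean
    mu = guided_mean(mean_fn(x), x, sigma, objective, scale)
    return mu + sigma * rng.gauss(0.0, 1.0)

# Toy stand-ins for two property scores, peaked at x = 1 and x = -1.
score_a = lambda x: -(x - 1.0) ** 2
score_b = lambda x: -(x + 1.0) ** 2
objective = weighted_sum([score_a, score_b], [0.75, 0.25])

rng = random.Random(0)
x = 3.0
for _ in range(50):
    x = reverse_step(x, lambda v: 0.9 * v, lambda v: 0.8 * v,
                     mix=0.5, sigma=0.1, objective=objective, scale=2.0, rng=rng)
print(x)
```

In this toy, raising `scale` pulls samples harder toward the regions the combined objective favors, while `mix` controls how often the fragment-level component drives a step. A real model would operate on molecular graphs with learned denoisers rather than scalar means.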
