LOGICS: Learning optimal generative distribution for designing de novo chemical structures.

Author information

Bae Bongsung, Bae Haelee, Nam Hojung

Affiliations

School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Buk-Gu, Gwangju, 61005, Republic of Korea.

AI Graduate School, Gwangju Institute of Science and Technology (GIST), Buk-Gu, Gwangju, 61005, Republic of Korea.

Publication information

J Cheminform. 2023 Sep 7;15(1):77. doi: 10.1186/s13321-023-00747-3.

Abstract

In recent years, computational drug design has made significant strides in developing artificial intelligence (AI) models that generate de novo chemical compounds with desired properties and biological activities, such as enhanced binding affinity to target proteins. These high-affinity compounds have the potential to be developed into more potent therapeutics for a broad spectrum of diseases. However, because the data required to train deep generative models are scarce, some of these approaches fine-tune their molecular generators using data obtained from a separate predictor. While these studies show that generative models can produce structures with the desired target properties, it remains unclear whether the diversity of the generated structures and the span of their chemical space align with the distribution of the intended target molecules. In this study, we present LOGICS, a novel generative framework for Learning Optimal Generative distribution Iteratively for designing target-focused Chemical Structures. We address the exploration-exploitation dilemma, the trade-off between exploring new options and exploiting current knowledge, by incorporating experience memory and employing a layered tournament selection approach to refine the fine-tuning process. The proposed method was applied to binding affinity optimization for two target proteins from different protein classes, the κ-opioid receptor and PIK3CA, and the quality and distribution of the generated molecules were evaluated. The results showed that LOGICS outperforms competing state-of-the-art models and generates more diverse de novo chemical structures with optimized properties. The source code is available in the GitHub repository (https://github.com/GIST-CSBL/LOGICS).
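To make the selection idea in the abstract concrete, the following is a minimal sketch of plain tournament selection over an experience memory of scored molecules. It is not the LOGICS implementation: the abstract does not specify how the "layered" variant works, so only the basic single-layer technique is shown. The function name `tournament_select`, the memory layout, and the scores are all hypothetical toy values chosen for illustration.

```python
import random


def tournament_select(memory, k, n_winners, rng):
    """Return n_winners entries, each the best of k randomly drawn candidates.

    memory is a list of (smiles, score) pairs; a higher score is better.
    A larger k favours exploitation, a smaller k favours exploration.
    """
    winners = []
    for _ in range(n_winners):
        # Draw k distinct contestants, then keep the highest-scoring one.
        contestants = rng.sample(memory, k)
        winners.append(max(contestants, key=lambda item: item[1]))
    return winners


# Hypothetical experience memory: (SMILES, predicted affinity score).
# The scores are placeholders, not results from the paper.
memory = [
    ("CCO", 5.1),
    ("c1ccccc1", 6.3),
    ("CC(=O)O", 4.8),
    ("CCN", 5.6),
    ("CC(C)O", 7.2),
    ("CCCC", 4.2),
]

rng = random.Random(0)  # seeded so the sketch is reproducible
selected = tournament_select(memory, k=3, n_winners=4, rng=rng)
```

In this sketch the tournament size `k` is the knob that trades exploration against exploitation: with `k` equal to the memory size, every tournament returns the single best entry, while `k = 1` reduces to uniform random sampling.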
