Exhaustive local chemical space exploration using a transformer model.

Author information

Tibo Alessandro, He Jiazhen, Janet Jon Paul, Nittinger Eva, Engkvist Ola

Affiliations

Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden.

Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D AstraZeneca, Gothenburg, Sweden.

Publication information

Nat Commun. 2024 Aug 25;15(1):7315. doi: 10.1038/s41467-024-51672-4.

Abstract

How many near-neighbors does a molecule have? This fundamental question in chemistry is crucial for molecular optimization problems under the similarity principle assumption. Generative models can sample molecules from a vast chemical space but lack explicit knowledge about molecular similarity. Therefore, these models need guidance from reinforcement learning to sample a relevant region of similar chemical space. However, they still lack a mechanism to measure the coverage of a specific region of the chemical space. To overcome these limitations, a source-target molecular transformer model, regularized via a similarity kernel function, is proposed. Trained on a very large dataset of ≥200 billion molecular pairs, the model enforces a direct relationship between the probability of generating a target molecule and its similarity to a source molecule. Results indicate that the regularization term significantly improves the correlation between generation probability and molecular similarity, enabling exhaustive exploration of molecule near-neighborhoods.
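The relationship the abstract describes, between a target's generation probability and its similarity to the source, can be illustrated with a small sketch. The abstract does not state which similarity measure or kernel is used, so the code below assumes Tanimoto similarity on sets of fingerprint bits, and the log-probabilities are invented toy values, not results from the paper. The helper names (`tanimoto`, `spearman`) are hypothetical and chosen only for this illustration.

```python
from typing import FrozenSet, List, Sequence


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits.

    Assumption: molecules are represented as sets of bit indices; the
    abstract does not specify the fingerprint or the similarity measure.
    """
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def _average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    vx = sum((p - mx) ** 2 for p in rx)
    vy = sum((q - my) ** 2 for q in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Toy data (illustrative only): one source and four candidate targets.
source = frozenset({1, 2, 3, 4})
targets = [
    frozenset({1, 2, 3, 4, 5}),
    frozenset({1, 2, 3}),
    frozenset({1, 2, 7, 8}),
    frozenset({9, 10}),
]
# Hypothetical log-probabilities a source-target model might assign.
log_probs = [-1.0, -1.5, -3.0, -6.0]

similarities = [tanimoto(source, t) for t in targets]
rho = spearman(similarities, log_probs)
```

In this toy case the ranking by log-probability matches the ranking by similarity exactly, so `rho` is 1. When the two rankings agree closely, sorting candidates by model probability approximates sorting them by similarity, which is what makes an exhaustive enumeration of the near-neighborhood plausible.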

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5c/11345417/f447e144ca04/41467_2024_51672_Fig1_HTML.jpg
