A data-driven group retrosynthesis planning model inspired by neurosymbolic programming.

Authors

Zhang Xuefeng, Lin Haowei, Zhang Muhan, Zhou Yuan, Ma Jianzhu

Affiliations

Institute for Artificial Intelligence, Peking University, Beijing, China.

Yau Mathematical Sciences Center, Tsinghua University, Beijing, China.

Publication information

Nat Commun. 2025 Jan 2;16(1):192. doi: 10.1038/s41467-024-55374-9.

Abstract

Deep generative models have garnered significant attention for their efficiency in drug discovery, yet the synthesis of proposed molecules remains a challenge. Retrosynthetic planning, a part of computer-assisted synthesis planning, addresses this challenge by recursively decomposing molecules using symbolic rules and machine-trained scoring functions. However, current methods often treat each molecule independently, missing the opportunity to exploit shared synthesis patterns and repeated pathways that may carry over from known synthesis routes to newly emerging, similar molecules, a notable challenge with AI-generated small molecules. Our investigation reveals reusable synthesis patterns that augment the reaction template library, resulting in progressively decreasing marginal inference time as the algorithm processes more molecules. Nevertheless, expanding the library enlarges the search space, necessitating investigation into methods for effectively predicting reactions in retrosynthesis search. Inspired by human learning, our algorithm, akin to neurosymbolic programming, builds upon commonly used multi-step concepts such as cascade and complementary reactions and can evolve from practical experience, enhancing the prediction model for fundamental and compositional reaction templates. The evolutionary process involves wake, abstraction, and dreaming phases, alternately extending the reaction template library and refining models for more efficient retrosynthesis. Our algorithm outperforms existing methods, discovers chemistry patterns, and significantly reduces inference time in retrosynthetic planning for a group of similar molecules, showcasing its potential in validating results from generative models.
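
To make the idea of reusable synthesis patterns concrete, the toy Python sketch below illustrates only the abstraction step in a very loose sense. It is not the authors' implementation: routes are reduced to lists of made-up template names, a "pattern" is assumed to be nothing more than a pair of consecutive templates that recurs across solved routes, and no neural scoring model, wake phase, or dreaming phase is modeled. The function names `abstract_patterns` and `compress` are hypothetical.

```python
from collections import Counter


def abstract_patterns(routes, min_count=2):
    """Toy abstraction: return adjacent template pairs seen at least min_count times."""
    pair_counts = Counter()
    for route in routes:
        # zip(route, route[1:]) yields each pair of consecutive steps in the route.
        pair_counts.update(zip(route, route[1:]))
    return [pair for pair, n in pair_counts.most_common() if n >= min_count]


def compress(route, composites):
    """Rewrite a route so that each known composite pair counts as a single step."""
    out, i = [], 0
    while i < len(route):
        pair = tuple(route[i:i + 2])
        if len(pair) == 2 and pair in composites:
            out.append("+".join(pair))  # one composite template replaces two steps
            i += 2
        else:
            out.append(route[i])
            i += 1
    return out


# Hypothetical solved routes; the template names are invented for illustration.
solved_routes = [
    ["deprotect", "amide_coupling", "reduction"],
    ["deprotect", "amide_coupling", "suzuki"],
    ["suzuki", "deprotect", "amide_coupling"],
]

composites = set(abstract_patterns(solved_routes))
new_route = ["suzuki", "deprotect", "amide_coupling", "reduction"]
compressed = compress(new_route, composites)

print(composites)
print(compressed)
```

In this toy setting the recurring pair becomes a single composite entry in the library, so a similar route needs three steps instead of four. That is one simplified way to picture why a larger template library can shorten search for a group of related molecules, while also enlarging the set of candidate templates at each step.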

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4355/11695995/b1d3a3a998c2/41467_2024_55374_Fig1_HTML.jpg
