Generating diversity and securing completeness in algorithmic retrosynthesis.

Authors

Mrugalla Florian, Franz Christopher, Alber Yannic, Mogk Georg, Villalba Martín, Mrziglod Thomas, Schewior Kevin

Affiliations

Bayer AG, Leverkusen, Germany.

, Frankfurt, Germany.

Publication information

J Cheminform. 2025 May 13;17(1):72. doi: 10.1186/s13321-025-00981-x.

Abstract

Chemical synthesis planning has benefited considerably from advances in machine learning. Neural networks can reliably and accurately predict reactions leading to a given, possibly complex, molecule. In this work we focus on algorithms that assemble such predictions into a full synthesis plan which, starting from simple building blocks, produces a given target molecule, a procedure known as retrosynthesis. Objective functions for this task are hard to define and context-specific. To generate a diverse set of synthesis plans for chemists to select from, we capture the concept of diversity in a novel chemical diversity score (CDS). Our experiments show that our algorithm outperforms Monte-Carlo Tree Search, the algorithm predominantly employed in this domain, both in diversity as measured by our score and in time efficiency.

SCIENTIFIC CONTRIBUTION: We adapt Depth-First Proof-Number Search (DFPN) and its variants, which have been applied to retrosynthesis before, to produce a set of solutions, with an explicit focus on diversity. We also make progress on understanding DFPN in terms of completeness, i.e., the ability to find a solution whenever one exists. DFPN is known to be incomplete, and we provide a much cleaner example of this, but we also show that it becomes complete when reinforced with a threshold-controlling routine from the literature. The accompanying source code is available at https://github.com/Bayer-Group/bayer-retrosynthesis-search.
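
For readers unfamiliar with proof-number search, the bookkeeping that DFPN builds on can be illustrated on a tiny, fixed AND/OR tree. The sketch below is a textbook-style illustration and not the authors' implementation: the names `Node` and `proof_numbers` are hypothetical, the modeling of molecules as OR nodes (any one reaction suffices) and reactions as AND nodes (all precursors are needed) is an assumption about the usual retrosynthesis setup, and the depth-first thresholds, the threshold-controlling routine, and the CDS are all omitted.

```python
import math
from dataclasses import dataclass, field

INF = math.inf


@dataclass
class Node:
    # Assumption: "OR" models a molecule, "AND" models a reaction.
    kind: str
    children: list = field(default_factory=list)
    # Only consulted for leaves: "proved", "disproved", or "unknown".
    status: str = "unknown"


def proof_numbers(node):
    """Return (proof number, disproof number) for a fixed AND/OR tree."""
    if not node.children:
        if node.status == "proved":      # e.g. an available building block
            return 0, INF
        if node.status == "disproved":   # e.g. no applicable reaction
            return INF, 0
        return 1, 1                      # not yet expanded
    pairs = [proof_numbers(child) for child in node.children]
    pns = [pn for pn, _ in pairs]
    dns = [dn for _, dn in pairs]
    if node.kind == "OR":
        # One solved child proves the node; all children must fail to disprove it.
        return min(pns), sum(dns)
    # AND node: all children must be solved; one failed child disproves it.
    return sum(pns), min(dns)


# Hypothetical example: a target with two candidate reactions.
block = Node("OR", status="proved")
reaction_a = Node("AND", [block, Node("OR")])
reaction_b = Node("AND", [Node("OR"), Node("OR"), Node("OR")])
target = Node("OR", [reaction_a, reaction_b])
print(proof_numbers(target))
```

In this example, reaction A needs only one more precursor to be solved while reaction B needs three, so a proof-number-guided search would prefer to expand below reaction A first.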
