Choe Junseok, Kim Hajung, Chok Yan Ting, Gim Mogan, Kang Jaewoo
Department of Computer Science, Korea University, Seoul, South Korea.
Department of Biomedical Engineering, Hankuk University of Foreign Studies, Yongin, South Korea.
J Cheminform. 2025 Aug 28;17(1):130. doi: 10.1186/s13321-025-01088-z.
Retrosynthesis, the process of deconstructing complex molecules into simpler, more accessible precursors, is a cornerstone of drug discovery and material design. While machine learning has improved single-step retrosynthesis prediction, generating complete multi-step retrosynthetic routes remains challenging. In this study, we explore the integration of single-step retrosynthesis models with various planning algorithms to improve multi-step retrosynthetic route generation. We expand the exploration space beyond previously limited settings by evaluating combinations of planning algorithms and single-step retrosynthesis models across diverse datasets, enabling a more comprehensive assessment of retrosynthetic strategies. We evaluate synthetic routes on two criteria: solvability, the ability to generate a complete route, and route feasibility, which reflects a route's practical executability in the laboratory. Our findings show that the model combination with the highest solvability does not always produce the most feasible routes, underscoring the need for more nuanced evaluation. Through a systematic analysis of these combinations, their performance across different datasets, and various practical metrics, our study provides a more comprehensive evaluation of retrosynthetic planning strategies. These insights contribute to a better understanding of computational retrosynthesis and its alignment with real-world applicability.