Hu Xiao, Chen Ziqi, Peng Bo, Adu-Ampratwum Daniel, Ning Xia
ArXiv. 2025 Mar 9:arXiv:2411.03320v4.
Accurate prediction of chemical reaction yields is crucial for optimizing organic synthesis, potentially reducing the time and resources spent on experimentation. With the rise of artificial intelligence (AI), there is growing interest in leveraging AI-based methods to accelerate yield prediction without conducting in vitro experiments. We present log-RRIM, an innovative graph transformer-based framework designed for predicting chemical reaction yields. A key feature of log-RRIM is its integration of a cross-attention mechanism that focuses on the interplay between reagents and reaction centers. This design reflects a fundamental principle of chemical reactions: reagents play a crucial role in influencing bond-breaking and bond-forming processes, which ultimately affect reaction yields. log-RRIM also implements a local-to-global reaction representation learning strategy. This approach first captures detailed molecule-level information and then models and aggregates intermolecular interactions. Through this hierarchical process, log-RRIM captures how different molecular fragments contribute to and influence the overall reaction yield, regardless of their size. log-RRIM shows superior performance in our experiments, especially for medium- to high-yielding reactions, which supports its reliability as a predictor. The framework's modeling of reactant-reagent interactions and its capture of molecular fragment contributions make it a valuable tool for reaction planning and optimization in chemical synthesis. The data and code for log-RRIM are available at https://github.com/ninglab/Yield_log_RRIM.
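The two ideas named in the abstract, cross-attention from reaction centers to reagents and local-to-global aggregation, can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). The function names `cross_attention` and `sum_pool`, the use of raw embeddings as queries, keys, and values without learned projections, the residual update, and sum pooling are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(center_atoms, reagent_atoms):
    """Each reaction-center atom (query) attends over all reagent atoms
    (keys and values). Hypothetical simplification: no learned projections,
    single head, scaled dot-product scores."""
    d = len(center_atoms[0])
    updated, weights = [], []
    for q in center_atoms:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in reagent_atoms
        ]
        w = softmax(scores)
        # Weighted sum of reagent atom vectors (the attention context).
        context = [
            sum(w[j] * reagent_atoms[j][i] for j in range(len(reagent_atoms)))
            for i in range(d)
        ]
        # Residual update (an assumption): keep the original atom
        # embedding and add the reagent context to it.
        updated.append([qi + ci for qi, ci in zip(q, context)])
        weights.append(w)
    return updated, weights


def sum_pool(vectors):
    """Aggregate a list of equal-length vectors by element-wise sum."""
    return [sum(column) for column in zip(*vectors)]


# Toy 2-dimensional atom embeddings (made-up numbers, not real chemistry).
center = [[1.0, 0.0], [0.0, 1.0]]
reagent = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Local step: reaction-center atoms are updated with reagent information.
updated_center, attn = cross_attention(center, reagent)

# Global step: pool atoms into molecule-level vectors, then pool those
# into a single reaction-level vector that a yield regressor could consume.
reaction_vector = sum_pool([sum_pool(updated_center), sum_pool(reagent)])
```

Sum pooling is used here only because it keeps the contribution of every atom regardless of molecule size. Whether log-RRIM uses this or another aggregation is not stated in the abstract.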