Hua Peng-Xiang, Huang Zhen, Xu Zhe-Yuan, Zhao Qiang, Ye Chen-Yang, Wang Yi-Feng, Xu Yun-He, Fu Yao, Ding Hu
School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui, 230026, China.
Key Laboratory of Precision and Intelligent Chemistry, CAS Key Laboratory of Urban Pollutant Conversion, Anhui Province Key Laboratory of Biomass Clean Energy, Department of Chemistry, University of Science and Technology of China, Hefei, Anhui, 230026, China.
Commun Chem. 2025 Feb 10;8(1):42. doi: 10.1038/s42004-025-01434-0.
Reaction optimization plays an essential role in chemical research and industrial production. When exploring a large reaction space, a practical challenge is reducing the heavy experimental load required to find high-yield conditions. In this paper, we present an efficient machine learning tool called "RS-Coreset". Its key idea is to use deep representation learning to guide an interactive procedure that builds a small set of reactions representing the full reaction space. The tool needs only small-scale data, for example 2.5% to 5% of the instances, to predict yields across the reaction space. We validate its performance on three public datasets and achieve state-of-the-art results. Moreover, we apply the tool to assist the real-world exploration of Lewis base-boryl radical-enabled dechlorinative coupling reactions in our lab. It helps us predict yields effectively and even discover several feasible reaction combinations that were overlooked in previous reports.
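The general idea of choosing a small representative subset and predicting yields for the whole space from it can be illustrated with a minimal sketch. This is not the RS-Coreset algorithm: it assumes fixed synthetic feature vectors in place of learned deep representations, uses greedy k-center (farthest-point) selection as a stand-in for the paper's interactive procedure, and fits a simple ridge regressor. The names `k_center_greedy`, `fit_ridge`, and `predict`, and all of the data, are hypothetical.

```python
import numpy as np


def k_center_greedy(features, budget, seed=0):
    """Pick `budget` row indices that spread out over feature space."""
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    selected = [int(rng.integers(n))]  # arbitrary starting point
    # Distance from every point to its nearest selected point so far.
    min_dist = np.linalg.norm(features - features[selected[0]], axis=1)
    while len(selected) < budget:
        nxt = int(np.argmax(min_dist))  # the point least covered so far
        selected.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected


def fit_ridge(x, y, alpha=1e-2):
    """Closed-form ridge regression with an appended bias column."""
    xb = np.hstack([x, np.ones((x.shape[0], 1))])
    gram = xb.T @ xb + alpha * np.eye(xb.shape[1])
    return np.linalg.solve(gram, xb.T @ y)


def predict(weights, x):
    """Apply the fitted ridge weights to new feature rows."""
    xb = np.hstack([x, np.ones((x.shape[0], 1))])
    return xb @ weights


# Synthetic stand-in for a reaction space: 2000 condition combinations,
# each described by 8 made-up features, with a noisy linear "yield".
rng = np.random.default_rng(42)
features = rng.normal(size=(2000, 8))
true_w = rng.normal(size=8)
yields = features @ true_w + 0.1 * rng.normal(size=2000)

# "Run experiments" on only 5% of the space, then predict the rest.
budget = int(0.05 * len(features))
coreset = k_center_greedy(features, budget)
weights = fit_ridge(features[coreset], yields[coreset])
predicted = predict(weights, features)
rmse = float(np.sqrt(np.mean((predicted - yields) ** 2)))
print(f"labelled {budget} of {len(features)} points, RMSE = {rmse:.3f}")
```

In a real setting the yields for the selected indices would come from experiments, and the selection and model fitting would alternate over several rounds rather than run once.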