King-Smith Emma, Berritt Simon, Bernier Louise, Hou Xinjun, Klug-McLeod Jacquelyn L, Mustakis Jason, Sach Neal W, Tucker Joseph W, Yang Qingyi, Howard Roger M, Lee Alpha A
Cavendish Laboratory, University of Cambridge, Cambridge, UK.
Pfizer Research and Development, Groton, CT, USA.
Nat Chem. 2024 Apr;16(4):633-643. doi: 10.1038/s41557-023-01393-w. Epub 2024 Jan 2.
High-throughput experimentation (HTE) has the potential to improve our understanding of organic chemistry by systematically interrogating reactivity across diverse chemical spaces. Notable bottlenecks include the scarcity of publicly available large-scale datasets and the need for facile interpretation of the chemical insights hidden in these data. Here we report the development of a high-throughput experimentation analyser, a robust and statistically rigorous framework that is applicable to any HTE dataset regardless of size, scope or target reaction outcome and that yields interpretable correlations between starting material(s), reagents and outcomes. We improve the HTE data landscape with the disclosure of 39,000+ previously proprietary HTE reactions that cover a breadth of chemistry, including cross-coupling reactions and chiral salt resolutions. The high-throughput experimentation analyser was validated on cross-coupling and hydrogenation datasets, where it elucidated statistically significant hidden relationships between reaction components and outcomes and highlighted areas of dataset bias and the specific reaction spaces that necessitate further investigation.