Götz Julian, Richards Euan, Stepek Iain A, Takahashi Yu, Huang Yi-Lin, Bertschi Louis, Rubi Bertran, Bode Jeffrey W
Laboratory for Organic Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, 8093 Zürich, Switzerland.
Molecular and Biomolecular Analysis Service (MoBiAS), Department of Chemistry and Applied Biosciences, ETH Zürich, 8093 Zürich, Switzerland.
Sci Adv. 2025 May 30;11(22):eadw6047. doi: 10.1126/sciadv.adw6047. Epub 2025 May 28.
Efficient drug discovery depends on reliable synthetic access to candidate molecules, but emerging machine learning approaches to predicting reaction outcomes are hampered by the scarcity of high-quality data. Here, we demonstrate an on-demand synthesis platform based on a three-component reaction that delivers drug-like molecules. Miniaturization and automation enable the execution and analysis of 50,000 distinct reactions from 193 different substrates on a 3-microliter scale, producing the largest public reaction outcome dataset. With machine learning, we accurately predict the outcomes of unknown reactions, including those involving unseen reactants, and analyze the impact of dataset size on model training. The resulting dataset is large enough to critically evaluate emerging machine learning approaches to chemical reactivity.
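The abstract distinguishes predictions for unknown reactions from predictions for unseen reactants. One common way to evaluate the latter is a reactant-level hold-out split, sketched below. This is an illustrative assumption, not the authors' published protocol: the function name `split_by_unseen_reactants`, the 20% hold-out fraction, and the toy substrate labels are all hypothetical.

```python
import random
from itertools import product


def split_by_unseen_reactants(reactions, holdout_fraction=0.2, seed=0):
    """Split reactions so each test reaction contains a held-out substrate.

    `reactions` is a list of tuples of substrate identifiers (hypothetical
    format). Training reactions contain no held-out substrate at all.
    """
    substrates = sorted({s for rxn in reactions for s in rxn})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    n_holdout = max(1, round(len(substrates) * holdout_fraction))
    held_out = set(rng.sample(substrates, n_holdout))

    train = [r for r in reactions if not held_out.intersection(r)]
    test = [r for r in reactions if held_out.intersection(r)]
    return train, test, held_out


# Toy three-component library (hypothetical labels, not the paper's substrates).
component_a = [f"A{i}" for i in range(4)]
component_b = [f"B{i}" for i in range(4)]
component_c = [f"C{i}" for i in range(4)]
reactions = list(product(component_a, component_b, component_c))

train, test, held_out = split_by_unseen_reactants(reactions)
print(len(train), len(test), sorted(held_out))
```

A random split over reactions would instead let every substrate appear in both training and test sets, which measures interpolation within known reactants rather than generalization to new ones.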