Tsuji Nobuya, Sidorov Pavel, Zhu Chendan, Nagata Yuuya, Gimadiev Timur, Varnek Alexandre, List Benjamin
Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, 001-0021, Japan.
Max-Planck-Institut für Kohlenforschung, 45470, Mülheim an der Ruhr, Germany.
Angew Chem Int Ed Engl. 2023 Mar 6;62(11):e202218659. doi: 10.1002/anie.202218659. Epub 2023 Feb 6.
Catalyst optimization typically relies on chemists' inductive, qualitative assumptions drawn from screening data. Machine learning models that use molecular properties or calculated 3D structures enable quantitative evaluation of such data, but they often require costly quantum chemical calculations. Readily available binary fingerprint descriptors, in contrast, are time- and cost-efficient, but their predictive performance remains insufficient. Here, we describe a machine learning model based on fragment descriptors that are fine-tuned for asymmetric catalysis and represent cyclic or polyaromatic hydrocarbons, enabling robust and efficient virtual screening. Using training data with only moderate selectivities, we designed new catalysts computationally and validated them experimentally; these catalysts show higher selectivities in a challenging asymmetric tetrahydropyran synthesis.
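The contrast the abstract draws between binary fingerprints and fragment descriptors can be illustrated with a toy sketch. This is not the descriptor scheme used in the paper, whose details are not given here. It assumes a simplified molecule representation (a list of atom labels plus a list of bonds) and a hypothetical helper, `path_fragments`, that counts linear atom paths. It only shows one general way a count-based fragment descriptor can separate structures that a presence/absence fingerprint cannot.

```python
from collections import Counter


def path_fragments(atoms, bonds, max_len=2):
    """Count linear atom-label paths of 1..max_len atoms (toy fragment descriptor).

    atoms: list of element symbols, indexed by atom number (hypothetical format).
    bonds: list of (i, j) index pairs; bond orders are ignored in this sketch.
    """
    neighbors = {i: set() for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    counts = Counter()

    def walk(path):
        # Every multi-atom path is reached from both ends; count it only once.
        if len(path) == 1 or path[0] < path[-1]:
            labels = [atoms[i] for i in path]
            # Use an orientation-independent label, so "O-C" and "C-O" coincide.
            counts["-".join(min(labels, labels[::-1]))] += 1
        if len(path) < max_len:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:
                    walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return counts


def binary_fingerprint(counts):
    """Presence/absence view of the same fragments (counts discarded)."""
    return frozenset(counts)


# Heavy-atom skeletons of ethanol (C-C-O) and 1-propanol (C-C-C-O).
ethanol = path_fragments(["C", "C", "O"], [(0, 1), (1, 2)])
propanol = path_fragments(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])

# With paths of at most two atoms, both molecules contain the same fragment types.
print(binary_fingerprint(ethanol) == binary_fingerprint(propanol))  # True
# The fragment counts still tell the two molecules apart.
print(ethanol == propanol)  # False
```

In a real workflow the count vectors would serve as features for a regression model of selectivity; that step is omitted here because the abstract does not specify the model.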