Moon Sooyeon, Chatterjee Sourav, Seeberger Peter H, Gilmore Kerry
Department of Biomolecular Systems, Max-Planck-Institute of Colloids and Interfaces, Am Mühlenberg 1, 14476 Potsdam, Germany.
Freie Universität Berlin, Institute of Chemistry and Biochemistry, Arnimallee 22, 14195 Berlin, Germany.
Chem Sci. 2020 Dec 26;12(8):2931-2939. doi: 10.1039/d0sc06222g.
Predicting the stereochemical outcome of chemical reactions is challenging for mechanistically ambiguous transformations. The stereoselectivity of glycosylation reactions is influenced by at least eleven factors spread across four chemical participants and temperature. A random forest algorithm was trained on a concise, highly reproducible dataset to accurately predict the stereoselective outcome of glycosylations. The steric and electronic contributions of all chemical reagents and solvents were quantified by quantum mechanical calculations. The trained model accurately predicts stereoselectivities for unseen nucleophiles, electrophiles, acid catalysts, and solvents across a wide temperature range (overall root mean square error 6.8%). All predictions were validated experimentally on a standardized microreactor platform. The model helped to identify novel ways to control glycosylation stereoselectivity and accurately predicts previously unknown means of stereocontrol. Quantifying the degree of influence of each variable begins to give a better general understanding of the transformation; for example, environmental factors influence the stereoselectivity of glycosylations more than the coupling partners do in this area of chemical space.
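The workflow summarized in the abstract (train a random forest on numerical descriptors, then score predictions by root mean square error) can be illustrated with a minimal sketch. This is not the authors' model or code. It is a toy, standard-library-only random forest regressor, and everything specific in it is an assumption made for illustration: the two descriptors (a temperature and a hypothetical solvent descriptor), the function names, the tree depth, and the synthetic selectivity values, which are generated from an invented linear rule and are not data from the paper.

```python
import math
import random


def rmse(predicted, measured):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(predicted))


def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean (used as the split criterion)."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values)


def build_tree(rows, targets, rng, depth=0, max_depth=3):
    """Grow one regression tree; each split considers a random subset of features."""
    if depth >= max_depth or len(set(targets)) <= 1:
        return mean(targets)  # leaf: average target of the samples that reached it
    n_features = len(rows[0])
    best = None
    for f in rng.sample(range(n_features), max(1, n_features // 2)):
        for threshold in sorted({row[f] for row in rows}):
            left = [i for i, row in enumerate(rows) if row[f] <= threshold]
            right = [i for i, row in enumerate(rows) if row[f] > threshold]
            if not left or not right:
                continue
            score = sse([targets[i] for i in left]) + sse([targets[i] for i in right])
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return mean(targets)
    _, f, threshold, left, right = best
    return (
        f,
        threshold,
        build_tree([rows[i] for i in left], [targets[i] for i in left], rng, depth + 1, max_depth),
        build_tree([rows[i] for i in right], [targets[i] for i in right], rng, depth + 1, max_depth),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are floats."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def fit_forest(rows, targets, n_trees=50, seed=0):
    """Train each tree on a bootstrap resample of the data (bagging)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Forest prediction is the average of the individual tree predictions."""
    return mean([predict_tree(tree, row) for tree in forest])


# Synthetic, purely illustrative data (NOT from the paper):
# each row is [temperature in deg C, hypothetical solvent descriptor].
rows = [[t, s] for t in range(-50, 31, 20) for s in (0.0, 1.0)]
# Invented rule producing a made-up "alpha selectivity in percent".
targets = [60 + 0.5 * t + 10 * s for t, s in rows]

forest = fit_forest(rows, targets, n_trees=50, seed=0)
predictions = [predict_forest(forest, row) for row in rows]
print("training RMSE (percentage points):", round(rmse(predictions, targets), 2))
```

The printed value is a training error on synthetic data and says nothing about the 6.8% figure reported above, which refers to experimentally validated predictions. In practice an established implementation such as scikit-learn's RandomForestRegressor would normally be preferred over a hand-written forest; the version here only makes the bootstrap, random feature subset, and averaging steps explicit.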