Prediction of Organic Reaction Outcomes Using Machine Learning

Authors

Coley Connor W, Barzilay Regina, Jaakkola Tommi S, Green William H, Jensen Klavs F

Affiliation

Department of Chemical Engineering and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States.

Publication information

ACS Cent Sci. 2017 May 24;3(5):434-443. doi: 10.1021/acscentsci.7b00064. Epub 2017 Apr 18.

Abstract

Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions is that proposed reaction steps often fail when attempted in the laboratory, despite initially seeming viable. The true measure of success for any synthesis program is whether the predicted outcome matches what is observed experimentally. We report a model framework for anticipating reaction outcomes that combines the traditional use of reaction templates with the flexibility in pattern recognition afforded by neural networks. Using 15 000 experimental reaction records from granted United States patents, a model is trained to select the major (recorded) product by ranking a self-generated list of candidates where one candidate is known to be the major product. Candidate reactions are represented using a unique edit-based representation that emphasizes the fundamental transformation from reactants to products, rather than the constituent molecules' overall structures. In a 5-fold cross-validation, the trained model assigns the major product rank 1 in 71.8% of cases, rank ≤3 in 86.7% of cases, and rank ≤5 in 90.8% of cases.
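The rank-based metrics quoted above (rank 1, rank ≤3, rank ≤5) can be illustrated with a minimal sketch. It assumes that each reaction comes with a list of candidate scores in which a higher score means a more likely product, and that the index of the recorded product is known. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-based rank of the recorded product among all candidates.

    Assumption: a higher score means a more likely product. Ties are counted
    pessimistically, so a candidate with an equal score ranks ahead.
    """
    true_score = scores[true_index]
    better_or_equal = sum(
        1 for i, s in enumerate(scores) if i != true_index and s >= true_score
    )
    return 1 + better_or_equal


def top_k_accuracy(ranks: Sequence[int], k: int) -> float:
    """Fraction of reactions whose recorded product is ranked k or better."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy data (hypothetical): (candidate scores, index of the recorded product).
examples = [
    ([2.1, 0.3, -1.0, 0.9], 0),
    ([0.2, 1.5, 0.7], 2),
    ([0.1, 0.4, 0.9, 1.3, 0.5, 0.2], 5),
    ([3.0, 2.9], 0),
]

ranks = [rank_of_true(scores, true_idx) for scores, true_idx in examples]

for k in (1, 3, 5):
    print(f"rank <= {k}: {top_k_accuracy(ranks, k):.1%}")
```

In a cross-validation setting, the same two functions would be applied to the held-out fold of each split and the results pooled; the sketch leaves out the scoring model itself.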

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8bc/5445544/b77b3b8a2194/oc-2017-00064k_0001.jpg
