SCScore: Synthetic Complexity Learned from a Reaction Corpus.

Affiliation

Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States.

Publication information

J Chem Inf Model. 2018 Feb 26;58(2):252-261. doi: 10.1021/acs.jcim.7b00622. Epub 2018 Jan 26.

Abstract

Several definitions of molecular complexity exist to facilitate prioritization of lead compounds, to identify diversity-inducing and complexifying reactions, and to guide retrosynthetic searches. In this work, we focus on synthetic complexity and reformalize its definition to correlate with the expected number of reaction steps required to produce a target molecule, with implicit knowledge about what compounds are reasonable starting materials. We train a neural network model on 12 million reactions from the Reaxys database to impose a pairwise inequality constraint enforcing the premise of this definition: that on average, the products of published chemical reactions should be more synthetically complex than their corresponding reactants. The learned metric (SCScore) exhibits highly desirable nonlinear behavior, particularly in recognizing increases in synthetic complexity throughout a number of linear synthetic routes.
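
The pairwise inequality constraint described above can be illustrated with a hinge-style ranking loss. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `pairwise_hinge_loss`, the `margin` value of 0.25, and the example scores are all hypothetical and chosen for illustration only. The loss is zero for a reaction whose product scores higher than its reactant by at least the margin, and grows linearly when that ordering is violated.

```python
from typing import Sequence


def pairwise_hinge_loss(
    reactant_scores: Sequence[float],
    product_scores: Sequence[float],
    margin: float = 0.25,  # assumed value, for illustration only
) -> float:
    """Mean hinge penalty over (reactant, product) score pairs.

    Each pair contributes max(0, margin + reactant - product), so a pair
    is penalized only when the product does not outscore the reactant
    by at least `margin`.
    """
    if len(reactant_scores) != len(product_scores):
        raise ValueError("score sequences must have the same length")
    if not reactant_scores:
        raise ValueError("at least one pair is required")

    total = 0.0
    for reactant, product in zip(reactant_scores, product_scores):
        total += max(0.0, margin + reactant - product)
    return total / len(reactant_scores)


# Hypothetical scores: the first pair satisfies the ordering, the second violates it.
loss = pairwise_hinge_loss([1.0, 2.0], [2.0, 1.0])
print(loss)
```

Because the penalty is averaged over pairs, individual reactions may violate the ordering without dominating the objective, which matches the "on average" wording of the premise.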
