Chemistry Informer Libraries: Conception, Early Experience, and Role in the Future of Cheminformatics.

Author information

Chemistry Capabilities Accelerating Therapeutics, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States.

Publication information

Acc Chem Res. 2021 Apr 6;54(7):1586-1596. doi: 10.1021/acs.accounts.0c00760. Epub 2021 Mar 16.

Abstract

The synthetic chemistry literature traditionally reports the scope of new methods using simple, nonstandardized test molecules that have uncertain relevance in applied synthesis. In addition, published examples heavily favor positive reaction outcomes, and failure is rarely documented. In this environment, synthetic practitioners have inadequate information to know whether any given method is suitable for the task at hand. Moreover, the incomplete nature of published data makes it poorly suited for the creation of predictive reactivity models via machine learning approaches. In 2016, we reported the concept of chemistry informer libraries as standardized sets of medium- to high-complexity substrates with relevance to pharmaceutical synthesis as demonstrated using a multidimensional principal component analysis (PCA) comparison to the physicochemical properties of marketed drugs. We showed how informer libraries could be used to evaluate leading synthetic methods with the complete capture of success and failure and how this knowledge could lead to improved reaction conditions with a broader scope with respect to relevant applications. In this Account, we describe the progress made and lessons learned in subsequent studies using informer libraries to profile eight additional reaction classes. Examining broad trends across multiple types of bond disconnections against a standardized chemistry "measuring stick" has enabled comparisons of the relative potential of different methods for applications in complex synthesis and has identified opportunities for further development. Furthermore, the powerful combination of informer libraries and 1536-well-plate nanoscale reaction screening has allowed the parallel evaluation of scores of synthetic methods in the same experiment and as such illuminated an important role for informers as part of a larger data generation workflow for predictive reactivity modeling.
Using informer libraries as problem-dense, strong filters has allowed broad sets of reaction conditions to be narrowed down to those that display the highest tolerance to complex substrates. These best conditions can then be used to survey broad swaths of substrate space using nanoscale chemistry approaches. Our experiences and those of our collaborators from several academic laboratories applying informer libraries in these contexts have helped us identify several areas for potential improvements to the approach that would increase their ease of use, utility in generating interpretable results, and resulting uptake by the broader community. As we continue to evolve the informer library concept, we believe it will play an ever-increasing role in the future of the democratization of high-throughput experimentation and data science-driven synthetic method development.

