Simple User-Friendly Reaction Format.

Author information

Nippa David F, Müller Alex T, Atz Kenneth, Konrad David B, Grether Uwe, Martin Rainer E, Schneider Gisbert

Affiliations

Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4070, Basel, Switzerland.

Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany.

Publication information

Mol Inform. 2025 Jan;44(1):e202400361. doi: 10.1002/minf.202400361.

Abstract

Utilizing the growing wealth of chemical reaction data can boost synthesis planning and increase success rates. Yet, the effectiveness of machine learning tools for retrosynthesis planning and forward reaction prediction relies on accessible, well-curated data presented in a structured format. Although some public and licensed reaction databases exist, they often lack essential information about reaction conditions. To address this issue and promote the principles of findable, accessible, interoperable, and reusable (FAIR) data reporting and sharing, we introduce the Simple User-Friendly Reaction Format (SURF). SURF standardizes the documentation of reaction data through a structured tabular format, requiring only a basic understanding of spreadsheets. This format enables chemists to record the synthesis of molecules in a format that is understandable by both humans and machines, which facilitates seamless sharing and integration directly into machine learning pipelines. SURF files are designed to be interoperable, easily imported into relational databases, and convertible into other formats. This complements existing initiatives like the Open Reaction Database (ORD) and Unified Data Model (UDM). At Roche, SURF plays a crucial role in democratizing FAIR reaction data sharing and expediting the chemical synthesis process.
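The claim that a tabular reaction file can be imported into a relational database with little effort can be illustrated with a minimal sketch. It assumes a tab-separated file with one reaction per row. The column names (`rxn_id`, `startingmat_smiles`, `product_smiles`, `solvent`, `temperature_deg_c`, `yield_percent`) and both sample rows are hypothetical placeholders chosen for illustration; they are not taken from the SURF specification, which should be consulted for the actual header.

```python
import csv
import io
import sqlite3

# Hypothetical header and made-up rows for illustration only;
# this is NOT the official SURF column set.
SAMPLE_TSV = (
    "rxn_id\tstartingmat_smiles\tproduct_smiles\tsolvent\t"
    "temperature_deg_c\tyield_percent\n"
    "rxn-001\tCCO\tCC=O\twater\t25\t80\n"
    "rxn-002\tc1ccccc1Br\tc1ccccc1O\tdioxane\t100\t45\n"
)


def read_reaction_table(text):
    """Parse tab-separated text into a list of dicts, one per reaction."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


def load_into_sqlite(rows):
    """Create an in-memory SQLite table and insert the parsed rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reactions ("
        "rxn_id TEXT PRIMARY KEY, "
        "startingmat_smiles TEXT, "
        "product_smiles TEXT, "
        "solvent TEXT, "
        "temperature_deg_c REAL, "
        "yield_percent REAL)"
    )
    # Named placeholders map directly onto the column names of each row dict.
    conn.executemany(
        "INSERT INTO reactions VALUES ("
        ":rxn_id, :startingmat_smiles, :product_smiles, "
        ":solvent, :temperature_deg_c, :yield_percent)",
        rows,
    )
    conn.commit()
    return conn


rows = read_reaction_table(SAMPLE_TSV)
conn = load_into_sqlite(rows)

# Example query: identifiers of reactions with a yield of at least 50 percent.
high_yield = conn.execute(
    "SELECT rxn_id FROM reactions WHERE yield_percent >= 50"
).fetchall()
print(high_yield)
```

Because each row is a flat record, the same parsed rows could equally be handed to a data-frame library or serialized to another format; the SQLite step is only one possible target.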


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e658/11755691/0c0bc408c4c2/MINF-44-e202400361-g002.jpg
