DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications.

Authors

Rajan Kohulan, Brinkhaus Henning Otto, Agea M Isabel, Zielesny Achim, Steinbeck Christoph

Affiliations

Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany.

Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technicka 5, 166 28, Prague, Czech Republic.

Publication information

Nat Commun. 2023 Aug 19;14(1):5045. doi: 10.1038/s41467-023-40782-0.

Abstract

The number of publications describing chemical structures has increased steadily over the last decades. However, the majority of published chemical information is currently not available in machine-readable form in public databases. It remains a challenge to automate the process of information extraction in a way that requires less manual intervention, especially the mining of chemical structure depictions. As an open-source platform that leverages recent advancements in deep learning, computer vision, and natural language processing, DECIMER.ai (Deep lEarning for Chemical IMagE Recognition) strives to automatically segment, classify, and translate chemical structure depictions from the printed literature. The segmentation and classification tools are the only openly available packages of their kind, and the optical chemical structure recognition (OCSR) core application yields outstanding performance on all benchmark datasets. The source code, the trained models and the datasets developed in this work have been published under permissive licences. An instance of the DECIMER web application is available at https://decimer.ai.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6bd/10439916/3cc11bc1479c/41467_2023_40782_Fig1_HTML.jpg
