CLiDE Pro: the latest generation of CLiDE, a tool for optical chemical structure recognition.

Author information

Valko Aniko T, Johnson A Peter

Affiliation

Keymodule Ltd., Hobberley Lodge, Hobberley Lane, Leeds LS17 8JQ, United Kingdom.

Publication information

J Chem Inf Model. 2009 Apr;49(4):780-7. doi: 10.1021/ci800449t.

Abstract

We present CLiDE Pro, the latest product of the long-term CLiDE project, which develops tools for the automatic extraction of chemical information from the literature. CLiDE Pro extracts chemical structure and generic structure information from electronic images of molecules available online and from scanned pages of chemical documents. The information is extracted in three phases: first, the image is segmented into text and graphical regions; next, the graphical regions are analyzed and, where possible, connection tables are reconstructed; finally, any generic structures are interpreted by matching R-groups found in the structure diagrams with those located in the text. The program has been tested on a large set of chemical structure images from various sources. The results demonstrate good performance in reconstructing connection tables, with few errors in interpreting the individual drawing features found in the structure diagrams. The full test set is presented for use in validating other similar systems.
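
The third phase described in the abstract, matching R-groups in a structure diagram against definitions in the text, can be illustrated with a small sketch. This is not CLiDE Pro's actual implementation, which the abstract does not describe. The function names, the `R1 = Me, Et; R2 = H` definition syntax, and the plain-string substituents are all assumptions made for illustration.

```python
import re
from itertools import product


def parse_r_group_definitions(text):
    """Parse definitions such as 'R1 = Me, Et; R2 = H' into a dict.

    Assumed (hypothetical) syntax: label, '=', comma-separated values,
    with definitions separated by semicolons.
    """
    definitions = {}
    for label, values in re.findall(r"\b(R\d*)\s*=\s*([^;]+)", text):
        definitions[label] = [v.strip() for v in values.split(",") if v.strip()]
    return definitions


def match_r_groups(diagram_labels, text):
    """Pair each R-group label seen in a diagram with its text definition.

    Returns (matched, unmatched): a dict of label -> substituent list,
    and a list of diagram labels that have no definition in the text.
    """
    definitions = parse_r_group_definitions(text)
    matched = {lab: definitions[lab] for lab in diagram_labels if lab in definitions}
    unmatched = [lab for lab in diagram_labels if lab not in definitions]
    return matched, unmatched


def enumerate_structures(matched):
    """List every combination of substituents for the matched labels."""
    labels = sorted(matched)
    combos = product(*(matched[lab] for lab in labels))
    return [dict(zip(labels, combo)) for combo in combos]


if __name__ == "__main__":
    matched, unmatched = match_r_groups(["R1", "R2", "R3"], "R1 = Me, Et; R2 = H")
    print(matched)
    print(unmatched)
    print(enumerate_structures(matched))
```

A real system would attach each substituent to the reconstructed connection table rather than keep it as a string. The sketch shows only the label-matching step and how unmatched labels can be reported rather than silently dropped.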
