Automated extraction of chemical structure information from digital raster images.

Author information

Park Jungkap, Rosania Gus R, Shedden Kerby A, Nguyen Mandee, Lyu Naesung, Saitou Kazuhiro

Affiliation

Michigan Alliance for Cheminformatic Exploration, Ann Arbor, MI, USA.

Publication information

Chem Cent J. 2009 Feb 5;3:4. doi: 10.1186/1752-153X-3-4.

Abstract

BACKGROUND

To search for chemical structures in research articles, the diagrams or text representing molecules must be translated into a standard chemical file format compatible with cheminformatic search engines. However, chemical information in research articles is often presented as analog diagrams of chemical structures embedded in digital raster images. Several software systems have been developed to automate the analog-to-digital conversion of chemical structure diagrams in scientific research articles, but their algorithmic performance and utility in cheminformatic research have not been investigated.

RESULTS

This paper critically reviews these systems and reports our recent development of ChemReader, a fully automated tool for extracting chemical structure diagrams from research articles and converting them into standard, searchable chemical file formats. The basic algorithms for recognizing the lines and letters that represent bonds and atoms in chemical structure diagrams can be run independently, in sequence, from a graphical user interface, and the algorithm parameters can be readily changed, which facilitates additional development specifically tailored to a chemical database annotation scheme. Compared with existing software programs such as OSRA, Kekule, and CLiDE, ChemReader outperformed the other systems on several sets of sample images from diverse sources, both in the rate of correct outputs and in the accuracy of extracting molecular substructure patterns.

CONCLUSION

The availability of ChemReader as a cheminformatic tool for extracting chemical structure information from digital raster images allows research and development groups to enrich their chemical structure databases by annotating entries with published research articles. Its stable performance and high accuracy suggest that ChemReader may be adequate for annotating a chemical database with links to scientific research articles.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ed6/2648963/1911900512b1/1752-153X-3-4-1.jpg
