CHEMSCANNER: extraction and re-use(ability) of chemical information from common scientific documents containing ChemDraw files.

Authors

Nguyen An, Huang Yu-Chieh, Tremouilhac Pierre, Jung Nicole, Bräse Stefan

Affiliations

Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany.

Institute of Organic Chemistry, Karlsruhe Institute of Technology, Fritz-Haber-Weg 6, 76131, Karlsruhe, Germany.

Publication information

J Cheminform. 2019 Dec 11;11(1):77. doi: 10.1186/s13321-019-0400-5.

DOI:10.1186/s13321-019-0400-5
PMID:33431008
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6907231/
Abstract

We developed CHEMSCANNER, a software that can be used for the extraction of chemical information from ChemDraw binary (CDX) or ChemDraw XML-based (CDXML) files and to retrieve the ChemDraw scheme from DOC, DOCX or XML documents. This can facilitate the reuse of chemical information embedded into diverse documents used as standard storage and communication instrument in chemical sciences (e.g. for student's theses, PhD theses, or publications). The extracted information is processed to reactions, molecules, as well as additional text and values and can be accessed via the CHEMSCANNER UI. CHEMSCANNER supports the export to Excel and CML, the direct import of the extracted data to the Open Source ELN Chemotion or the use via "copy and paste" of selected information. The software was designed with a focus on the processing of documents with embedded molecular structure information as CDX or CDXML as these are the most common file formats for chemical drawings. The project aims to support the chemists in their efforts to re-use chemistry research data by providing them missing tools for an automated assembly of reaction data.
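
To make the kind of extraction described in the abstract concrete, here is a minimal sketch in Python. It is not CHEMSCANNER's implementation and does not use its API. It assumes only that CDXML is plain XML in which a `fragment` element holds atom nodes (`n`, with an atomic number in `Element`, carbon when omitted) and bonds (`b`, with `B` and `E` pointing at node ids and an optional `Order`), and that free-standing captions are `t` elements directly under `page`. The sample document, the `SYMBOLS` table, and the function name `summarize_cdxml` are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified CDXML-like sample (an assumption for illustration,
# not taken from the paper): one C=O fragment plus a free-standing text label.
SAMPLE_CDXML = """<CDXML>
  <page>
    <fragment id="1">
      <n id="2" p="100 100"/>
      <n id="3" p="130 100" Element="8"/>
      <b id="4" B="2" E="3" Order="2"/>
    </fragment>
    <t id="5" p="100 140"><s>reflux, 2 h</s></t>
  </page>
</CDXML>"""

# Small lookup table; atomic numbers outside it are shown as "Z<number>".
SYMBOLS = {1: "H", 6: "C", 7: "N", 8: "O", 9: "F", 16: "S", 17: "Cl"}


def summarize_cdxml(xml_text):
    """Return atoms and bonds per fragment, plus page-level text labels."""
    root = ET.fromstring(xml_text)
    fragments = []
    for frag in root.iter("fragment"):
        atoms = {}
        for node in frag.findall("n"):
            # A node without an Element attribute is treated as carbon.
            z = int(node.get("Element", "6"))
            atoms[node.get("id")] = SYMBOLS.get(z, f"Z{z}")
        bonds = [
            (atoms.get(b.get("B"), "?"), atoms.get(b.get("E"), "?"), b.get("Order", "1"))
            for b in frag.findall("b")
        ]
        fragments.append({"atoms": sorted(atoms.values()), "bonds": bonds})
    # Only <t> elements directly under <page> count as free-standing labels;
    # <t> elements nested inside atom nodes are atom labels and are skipped.
    labels = [
        "".join(t.itertext()).strip()
        for page in root.iter("page")
        for t in page.findall("t")
    ]
    return {"fragments": fragments, "labels": labels}


if __name__ == "__main__":
    print(summarize_cdxml(SAMPLE_CDXML))
```

Real ChemDraw files are considerably richer (nested fragments, abbreviations, arrows and other graphics that define reactions), so a sketch like this covers only the first step of turning a drawing into molecules and accompanying text.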

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5def/6907231/44b75b476f21/13321_2019_400_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5def/6907231/f9d7c6db9725/13321_2019_400_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5def/6907231/ef586c33b5a7/13321_2019_400_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5def/6907231/7659bec0178e/13321_2019_400_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5def/6907231/b979aa92eb0a/13321_2019_400_Fig5_HTML.jpg

Similar articles

1
CHEMSCANNER: extraction and re-use(ability) of chemical information from common scientific documents containing ChemDraw files.
J Cheminform. 2019 Dec 11;11(1):77. doi: 10.1186/s13321-019-0400-5.
2
[Construction of chemical information database based on optical structure recognition technique].
Beijing Da Xue Xue Bao Yi Xue Ban. 2018 Apr 18;50(2):352-357.
3
Chemotion ELN: an Open Source electronic lab notebook for chemists in academia.
J Cheminform. 2017 Sep 25;9(1):54. doi: 10.1186/s13321-017-0240-0.
4
SPECTRa-T: machine-based data extraction and semantic searching of chemistry e-theses.
J Chem Inf Model. 2010 Feb 22;50(2):251-61. doi: 10.1021/ci9003688.
5
Chemotion-ELN part 2: adaption of an embedded Ketcher editor to advanced research applications.
J Cheminform. 2018 Aug 13;10(1):38. doi: 10.1186/s13321-018-0292-9.
6
ChemSpectra: a web-based spectra editor for analytical data.
J Cheminform. 2021 Feb 10;13(1):8. doi: 10.1186/s13321-020-00481-0.
7
PubMedPortable: A Framework for Supporting the Development of Text Mining Applications.
PLoS One. 2016 Oct 5;11(10):e0163794. doi: 10.1371/journal.pone.0163794. eCollection 2016.
8
DECIMER-Segmentation: Automated extraction of chemical structure depictions from scientific literature.
J Cheminform. 2021 Mar 8;13(1):20. doi: 10.1186/s13321-021-00496-1.
9
Large-Scale Data Mining of Rapid Residue Detection Assay Data From HTML and PDF Documents: Improving Data Access and Visualization for Veterinarians.
Front Vet Sci. 2021 Jul 21;8:674730. doi: 10.3389/fvets.2021.674730. eCollection 2021.
10
Software review: The JATSdecoder package-extract metadata, abstract and sectioned text from NISO-JATS coded XML documents; Insights to PubMed central's open access database.
Scientometrics. 2021;126(12):9585-9601. doi: 10.1007/s11192-021-04162-z. Epub 2021 Oct 24.

Cited by

1
Implementation of an open chemistry knowledge base with a Semantic Wiki.
J Cheminform. 2025 Jul 6;17(1):99. doi: 10.1186/s13321-025-01037-w.
2
3D-QSAR, Scaffold Hopping, Virtual Screening, and Molecular Dynamics Simulations of Pyridin-2-one as mIDH1 Inhibitors.
Int J Mol Sci. 2024 Jul 6;25(13):7434. doi: 10.3390/ijms25137434.
3
Genistein exerts anti-colorectal cancer actions: clinical reports, computational and validated findings.
Aging (Albany NY). 2023 May 7;15(9):3678-3689. doi: 10.18632/aging.204702.
4
Molecular representations in AI-driven drug discovery: a review and practical guide.
J Cheminform. 2020 Sep 17;12(1):56. doi: 10.1186/s13321-020-00460-5.

References

1
Chemotion-ELN part 2: adaption of an embedded Ketcher editor to advanced research applications.
J Cheminform. 2018 Aug 13;10(1):38. doi: 10.1186/s13321-018-0292-9.
2
Chemotion ELN: an Open Source electronic lab notebook for chemists in academia.
J Cheminform. 2017 Sep 25;9(1):54. doi: 10.1186/s13321-017-0240-0.
3
Information Retrieval and Text Mining Technologies for Chemistry.
Chem Rev. 2017 Jun 28;117(12):7673-7761. doi: 10.1021/acs.chemrev.6b00851. Epub 2017 May 5.
4
SureChEMBL: a large-scale, chemically annotated patent document database.
Nucleic Acids Res. 2016 Jan 4;44(D1):D1220-8. doi: 10.1093/nar/gkv1253. Epub 2015 Nov 17.
5
CHEMDNER: The drugs and chemical names extraction challenge.
J Cheminform. 2015 Jan 19;7(Suppl 1 Text mining for chemistry and the CHEMDNER track):S1. doi: 10.1186/1758-2946-7-S1-S1. eCollection 2015.
6
JSME: a free molecule editor in JavaScript.
J Cheminform. 2013 May 21;5:24. doi: 10.1186/1758-2946-5-24. eCollection 2013.
7
Extracting and connecting chemical structures from text sources using chemicalize.org.
J Cheminform. 2013 Apr 23;5(1):20. doi: 10.1186/1758-2946-5-20.
8
OSCAR4: a flexible architecture for chemical text-mining.
J Cheminform. 2011 Oct 14;3(1):41. doi: 10.1186/1758-2946-3-41.
9
Open Babel: An open chemical toolbox.
J Cheminform. 2011 Oct 7;3:33. doi: 10.1186/1758-2946-3-33.
10
ChemicalTagger: A tool for semantic text-mining in chemistry.
J Cheminform. 2011 May 16;3(1):17. doi: 10.1186/1758-2946-3-17.