Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture.

Authors

Rajan Kohulan, Brinkhaus Henning Otto, Zielesny Achim, Steinbeck Christoph

Affiliations

Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessingstr. 8, 07743, Jena, Germany.

Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665, Recklinghausen, Germany.

Publication information

J Cheminform. 2024 Jul 5;16(1):78. doi: 10.1186/s13321-024-00872-7.

DOI: 10.1186/s13321-024-00872-7
PMID: 38970120
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11227129/

Abstract

Accurate recognition of hand-drawn chemical structures is crucial for digitising hand-written chemical information in traditional laboratory notebooks or facilitating stylus-based structure entry on tablets or smartphones. However, the inherent variability in hand-drawn structures poses challenges for existing Optical Chemical Structure Recognition (OCSR) software. To address this, we present an enhanced Deep lEarning for Chemical ImagE Recognition (DECIMER) architecture that leverages a combination of Convolutional Neural Networks (CNNs) and Transformers to improve the recognition of hand-drawn chemical structures. The model incorporates an EfficientNetV2 CNN encoder that extracts features from hand-drawn images, followed by a Transformer decoder that converts the extracted features into Simplified Molecular Input Line Entry System (SMILES) strings. Our models were trained using synthetic hand-drawn images generated by RanDepict, a tool for depicting chemical structures with different style elements. A benchmark was performed using a real-world dataset of hand-drawn chemical structures to evaluate the model's performance. The results indicate that our improved DECIMER architecture exhibits a significantly enhanced recognition accuracy compared to other approaches.

SCIENTIFIC CONTRIBUTION: The new DECIMER model presented here refines our previous research efforts and is currently the only open-source model tailored specifically for the recognition of hand-drawn chemical structures. The enhanced model performs better in handling variations in handwriting styles, line thicknesses, and background noise, making it suitable for real-world applications. The DECIMER hand-drawn structure recognition model and its source code have been made available as an open-source package under a permissive license.
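
The abstract describes a Transformer decoder that emits SMILES strings. A decoder of this kind typically predicts one token at a time from a fixed vocabulary, so the target SMILES must first be split into tokens and mapped to integer ids. The following is a minimal sketch of that preprocessing step under stated assumptions: the regular expression, the `<start>`/`<end>`/`<pad>` markers, and the helper names `tokenize`, `build_vocab`, and `encode` are illustrative choices, not the tokenizer used by DECIMER.

```python
import re

# Illustrative SMILES token pattern (an assumption, not DECIMER's tokenizer):
# bracket atoms, two-letter halogens, single-letter atoms, two-digit ring
# closures (%nn), single-digit ring closures, then bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[A-Za-z]|%\d{2}|\d|[()=#\-+\\/:.~@*$])"
)


def tokenize(smiles):
    """Split a SMILES string into tokens wrapped in start/end markers."""
    tokens = SMILES_TOKEN.findall(smiles)
    # If any character was skipped by the pattern, the round trip fails.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in SMILES: {smiles!r}")
    return ["<start>"] + tokens + ["<end>"]


def build_vocab(token_lists):
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    distinct = sorted({tok for tokens in token_lists for tok in tokens})
    for index, tok in enumerate(distinct, start=1):
        vocab[tok] = index
    return vocab


def encode(tokens, vocab, max_len):
    """Convert tokens to ids and right-pad with 0 up to max_len."""
    if len(tokens) > max_len:
        raise ValueError("token sequence longer than max_len")
    ids = [vocab[tok] for tok in tokens]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


if __name__ == "__main__":
    corpus = ["CCO", "ClCCBr", "C[C@H](N)C(=O)O"]
    token_lists = [tokenize(s) for s in corpus]
    vocab = build_vocab(token_lists)
    for smiles, tokens in zip(corpus, token_lists):
        print(smiles, tokens, encode(tokens, vocab, max_len=16))
```

Treating bracket atoms such as `[C@H]` as single tokens keeps stereochemistry and charge annotations intact in the vocabulary, at the cost of a larger token set. Comparing predictions against ground truth would additionally require canonicalising both SMILES strings with a cheminformatics toolkit, which this sketch does not attempt.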

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ee5/11227129/e3fcde17cb34/13321_2024_872_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ee5/11227129/306c67fdb897/13321_2024_872_Fig1_HTML.jpg

Similar articles

1
Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture.
J Cheminform. 2024 Jul 5;16(1):78. doi: 10.1186/s13321-024-00872-7.
2
DECIMER 1.0: deep learning for chemical image recognition using transformers.
J Cheminform. 2021 Aug 17;13(1):61. doi: 10.1186/s13321-021-00538-8.
3
DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications.
Nat Commun. 2023 Aug 19;14(1):5045. doi: 10.1038/s41467-023-40782-0.
4
DECIMER-Segmentation: Automated extraction of chemical structure depictions from scientific literature.
J Cheminform. 2021 Mar 8;13(1):20. doi: 10.1186/s13321-021-00496-1.
5
DECIMER-hand-drawn molecule images dataset.
J Cheminform. 2022 Jun 9;14(1):36. doi: 10.1186/s13321-022-00620-9.
6
DECIMER: towards deep learning for chemical image recognition.
J Cheminform. 2020 Oct 27;12(1):65. doi: 10.1186/s13321-020-00469-w.
7
ChemReco: automated recognition of hand-drawn carbon-hydrogen-oxygen structures using deep learning.
Sci Rep. 2024 Jul 25;14(1):17126. doi: 10.1038/s41598-024-67496-7.
8
ChemPix: automated recognition of hand-drawn hydrocarbon structures using deep learning.
Chem Sci. 2021 Jul 3;12(31):10622-10633. doi: 10.1039/d1sc02957f. eCollection 2021 Aug 11.
9
Context awareness based Sketch-DeepNet architecture for hand-drawn sketches classification and recognition in AIoT.
PeerJ Comput Sci. 2023 Apr 27;9:e1186. doi: 10.7717/peerj-cs.1186. eCollection 2023.
10
ABC-Net: a divide-and-conquer based deep learning architecture for SMILES recognition from molecular images.
Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac033.
