MolNexTR: a generalized deep learning model for molecular image recognition.

Author information

Chen Yufan, Leung Ching Ting, Huang Yong, Sun Jianwei, Chen Hao, Gao Hanyu

Affiliations

Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, SAR, China.

Department of Chemistry, Hong Kong University of Science and Technology, Hong Kong, SAR, China.

Publication information

J Cheminform. 2024 Dec 18;16(1):141. doi: 10.1186/s13321-024-00926-w.

DOI: 10.1186/s13321-024-00926-w
PMID: 39696616
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11654183/

Abstract

In the field of chemical structure recognition, the task of converting molecular images into machine-readable data formats such as SMILES string stands as a significant challenge, primarily due to the varied drawing styles and conventions prevalent in chemical literature. To bridge this gap, we proposed MolNexTR, a novel image-to-graph deep learning model that collaborates to fuse the strengths of ConvNext, a powerful Convolutional Neural Network variant, and Vision-TRansformer. This integration facilitates a more detailed extraction of both local and global features from molecular images. MolNexTR can predict atoms and bonds simultaneously and understand their layout rules. It also excels at flexibly integrating symbolic chemistry principles to discern chirality and decipher abbreviated structures. We further incorporate a series of advanced algorithms, including an improved data augmentation module, an image contamination module, and a post-processing module for getting the final SMILES output. These modules cooperate to enhance the model's robustness to diverse styles of molecular images found in real literature. In our test sets, MolNexTR has demonstrated superior performance, achieving an accuracy rate of 81-97%, marking a significant advancement in the domain of molecular structure recognition.

Scientific contribution

MolNexTR is a novel image-to-graph model that incorporates a unique dual-stream encoder to extract complex molecular image features, and combines chemical rules to predict atoms and bonds while understanding atom and bond layout rules. In addition, it employs a series of novel augmentation algorithms to significantly enhance the robustness and performance of the model.
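
The abstract describes an image-to-graph pipeline whose post-processing module turns predicted atoms and bonds into the final SMILES output. As a rough illustration of that last step only, and not the authors' implementation, the Python sketch below serializes a small predicted graph into a non-canonical SMILES string. The function name `graph_to_smiles`, the input layout (a list of element symbols plus `(i, j, order)` bond tuples) and the restriction to neutral organic-subset atoms are all assumptions made for this example. Chirality, abbreviated groups, aromatic notation and charges, which the paper handles with chemistry rules, are deliberately left out.

```python
from collections import defaultdict

# Hypothetical helper names for illustration; not taken from the MolNexTR code.
ORGANIC_SUBSET = {"B", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I"}
BOND_SYMBOL = {1: "", 2: "=", 3: "#"}


def graph_to_smiles(atoms, bonds):
    """Serialize a predicted molecular graph as a non-canonical SMILES string.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j, order) tuples with order in {1, 2, 3}
    """
    for symbol in atoms:
        if symbol not in ORGANIC_SUBSET:
            raise ValueError(f"unsupported atom symbol: {symbol}")

    adjacency = defaultdict(list)
    for i, j, order in bonds:
        adjacency[i].append((j, order))
        adjacency[j].append((i, order))

    visited = set()
    ring_digit = {}               # frozenset({i, j}) -> ring-closure number
    closures = defaultdict(list)  # atom index -> [(number, bond order), ...]

    def find_rings(node, parent):
        # Depth-first search: an edge to an already visited atom closes a ring.
        visited.add(node)
        for neighbor, order in adjacency[node]:
            if neighbor == parent:
                continue
            if neighbor not in visited:
                find_rings(neighbor, node)
                continue
            edge = frozenset((node, neighbor))
            if edge not in ring_digit:
                number = len(ring_digit) + 1
                ring_digit[edge] = number
                closures[node].append((number, order))
                closures[neighbor].append((number, order))

    def write(node, parent):
        # Emit the atom, its ring-closure labels, then its spanning-tree children.
        text = atoms[node]
        for number, order in closures[node]:
            label = str(number) if number < 10 else f"%{number}"
            text += BOND_SYMBOL[order] + label
        children = [
            (neighbor, order)
            for neighbor, order in adjacency[node]
            if neighbor != parent
            and frozenset((node, neighbor)) not in ring_digit
        ]
        for k, (child, order) in enumerate(children):
            branch = BOND_SYMBOL[order] + write(child, node)
            # Every child except the last is written as a parenthesized branch.
            text += branch if k == len(children) - 1 else f"({branch})"
        return text

    fragments = []
    for start in range(len(atoms)):
        if start not in visited:
            find_rings(start, None)
            fragments.append(write(start, None))
    return ".".join(fragments)  # disconnected fragments are dot-separated


# Ethanol, acetic acid, and a Kekulé benzene ring.
print(graph_to_smiles(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)]))  # CCO
print(graph_to_smiles(["C", "C", "O", "O"],
                      [(0, 1, 1), (1, 2, 2), (1, 3, 1)]))        # CC(=O)O
print(graph_to_smiles(["C"] * 6,
                      [(0, 1, 2), (1, 2, 1), (2, 3, 2),
                       (3, 4, 1), (4, 5, 2), (5, 0, 1)]))        # C1=CC=CC=C1
```

The output is valid but not canonical SMILES, so comparing predictions against ground truth would still require canonicalizing both sides with a cheminformatics toolkit.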
