Image2InChI: Automated Molecular Optical Image Recognition.

Author information

Li Da-Zhou, Xu Xin, Pan Jia-Heng, Gao Wei, Zhang Shi-Rui

Affiliation

College of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110000, China.

Publication information

J Chem Inf Model. 2024 May 13;64(9):3640-3649. doi: 10.1021/acs.jcim.3c02082. Epub 2024 Feb 15.

Abstract

Accurately identifying and analyzing the chemical structures in molecular images is a prerequisite for artificial intelligence in drug discovery, so molecular images must be converted into machine-readable representations efficiently and automatically. In this paper, we propose Image2InChI, an automated molecular optical image recognition model based on deep learning. Image2InChI introduces a novel feature fusion network with attention that integrates image patches with InChI prediction. An improved SwinTransformer serves as the encoder and a Transformer Decoder with patch embedding serves as the decoder; together they predict the InChI that corresponds to the image features. In our experiments, Image2InChI achieves an InChI accuracy (InChI acc) of 99.8%, a Morgan fingerprint (Morgan FP) score of 94.1%, a maximum common structure accuracy (MCS acc) of 94.8%, and a longest common subsequence accuracy (LCS acc) of 96.2%. These results indicate that Image2InChI improves the accuracy and efficiency of molecular image recognition and provides a useful reference for optical chemical structure recognition that targets InChI.
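The abstract reports string-level metrics without defining them, so the following is only a minimal sketch of how two of them might be computed. It assumes that InChI acc is the fraction of predicted InChI strings that exactly match the reference, and that an LCS-based score is the longest common subsequence length divided by the length of the longer string. Both definitions, and the function names `exact_match_accuracy`, `lcs_length`, and `lcs_similarity`, are assumptions made for illustration and may differ from what the paper uses.

```python
from typing import Sequence


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ends before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of skipping one character.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(pred: str, ref: str) -> float:
    """LCS length normalised by the longer string (assumed normalisation)."""
    longest = max(len(pred), len(ref))
    return 1.0 if longest == 0 else lcs_length(pred, ref) / longest


def exact_match_accuracy(preds: Sequence[str], refs: Sequence[str]) -> float:
    """Fraction of predictions identical to their reference (assumed InChI acc)."""
    if len(preds) != len(refs):
        raise ValueError("preds and refs must have the same length")
    if not refs:
        return 0.0
    return sum(p == r for p, r in zip(preds, refs)) / len(refs)


# Toy data: methane is predicted correctly, water is mispredicted as H2S.
refs = ["InChI=1S/CH4/h1H4", "InChI=1S/H2O/h1H2"]
preds = ["InChI=1S/CH4/h1H4", "InChI=1S/H2S/h1H2"]

inchi_acc = exact_match_accuracy(preds, refs)
mean_lcs = sum(lcs_similarity(p, r) for p, r in zip(preds, refs)) / len(refs)
print(f"exact match: {inchi_acc:.3f}, mean LCS similarity: {mean_lcs:.3f}")
```

On this toy data the exact-match score drops to one half because of a single wrong character, while the LCS-based score stays close to 1, which is why a partial-credit metric is useful alongside strict accuracy. The structure-level metrics (Morgan FP, MCS acc) need a cheminformatics toolkit and are not sketched here.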
