ConBGAT: a novel model combining convolutional neural networks, transformer and graph attention network for information extraction from scanned image.

Author information

Ho Vo Hoang Duy, Vo Quoc Huy, Hung Bui Thanh

Affiliation

Data Science Laboratory/Data Science Department/Faculty of Information Technology, Industrial University of Ho Chi Minh City, Ho Chi Minh, Vietnam.

Publication information

PeerJ Comput Sci. 2024 Nov 28;10:e2536. doi: 10.7717/peerj-cs.2536. eCollection 2024.

Abstract

Extracting information from scanned images is a critical task with far-reaching practical implications. Traditional methods often fall short because they do not fully exploit both image and text features, which makes their results less accurate and less efficient. In this study, we introduce ConBGAT, a model that integrates convolutional neural networks (CNNs), Transformers, and graph attention networks to address these shortcomings. Our approach constructs graphs from the text regions within an image, using Optical Character Recognition (OCR) to detect and recognize the characters. By combining image features extracted by CNNs with text features extracted by Distilled Bidirectional Encoder Representations from Transformers (DistilBERT), the model obtains a comprehensive and efficient data representation. Testing on real-world datasets shows that ConBGAT significantly outperforms existing methods across multiple evaluation metrics. This improves accuracy and sets a new benchmark for information extraction from scanned images.
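
The abstract does not say how the graph is built, how the two feature types are fused, or which attention formulation is used. As a rough illustration only, the sketch below assumes that each OCR text region becomes a node, that nodes are linked to their nearest neighbours by box-centre distance, that image and text features are simply concatenated, and that attention scores are plain dot products rather than the learned scoring of a real graph attention network. All function names, the neighbour rule, and the toy feature values are hypothetical and are not taken from the paper.

```python
import math

def box_center(box):
    """Centre (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def build_knn_edges(boxes, k=2):
    """Assumed graph rule: link each text region to its k nearest regions."""
    centers = [box_center(b) for b in boxes]
    edges = {}
    for i, ci in enumerate(centers):
        others = sorted(
            (j for j in range(len(centers)) if j != i),
            key=lambda j: math.dist(ci, centers[j]),
        )
        edges[i] = others[:k]
    return edges

def fuse(image_feat, text_feat):
    """Assumed fusion: concatenate the image and text feature vectors."""
    return list(image_feat) + list(text_feat)

def attention_step(features, edges):
    """One simplified attention-weighted aggregation per node.

    Scores are unlearned dot products, normalised with a softmax over the
    node itself and its neighbours. A real GAT layer learns these scores.
    """
    updated = []
    for i, h_i in enumerate(features):
        neigh = [i] + edges[i]  # include a self-loop
        scores = [sum(a * b for a, b in zip(h_i, features[j])) for j in neigh]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        updated.append([
            sum(alpha * features[j][d] for alpha, j in zip(alphas, neigh))
            for d in range(len(h_i))
        ])
    return updated

# Toy input: three text boxes with made-up image and text feature vectors.
boxes = [(0, 0, 10, 5), (12, 0, 22, 5), (0, 20, 10, 25)]
image_feats = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.5]]  # stand-in for CNN output
text_feats = [[1.0], [0.5], [0.2]]                  # stand-in for DistilBERT output

edges = build_knn_edges(boxes, k=1)
nodes = [fuse(i, t) for i, t in zip(image_feats, text_feats)]
out = attention_step(nodes, edges)
```

Each row of `out` is a weighted average of a node's own fused features and those of its neighbours, which is the basic mechanism by which a graph attention layer lets nearby text regions inform one another.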

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7a0/11622835/6c1003c65648/peerj-cs-10-2536-g001.jpg
