Text Font Correction and Alignment Method for Scene Text Recognition

Author information

Ding Liuxu, Liu Yuefeng, Zhao Qiyan, Liu Yunong

Affiliation

School of Digital and Intelligent Industry, Inner Mongolia University of Science and Technology, Baotou 014010, China.

Publication information

Sensors (Basel). 2024 Dec 11;24(24):7917. doi: 10.3390/s24247917.

Abstract

Text recognition is a rapidly evolving task with broad practical applications across multiple industries. It remains challenging, however, because of arbitrarily shaped text arrangements, irregular text fonts, and unintended font occlusion. To handle images with arbitrarily shaped text arrangements and irregular text fonts, we designed the Discriminative Standard Text Font (DSTF) and the Feature Alignment and Complementary Fusion (FACF). To address unintended font occlusion, we propose a Dual Attention Serial Module (DASM), which is integrated between residual modules to enhance the focus on text texture. These components improve text recognition by correcting irregular text and aligning it with the original feature extraction, thereby complementing the overall recognition process. Additionally, to support the study of text recognition in natural scenes, we developed the VBC Chinese dataset, collected under varying lighting conditions, including strong light, weak light, darkness, and other natural environments. Experimental results show that our method achieves competitive performance, with an accuracy of 90.8% on the VBC dataset and an overall average accuracy of 93.8%.
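
The abstract does not say which two attention types DASM combines, in what order, or how they are parameterized. As a rough illustration of what "serial" dual attention can look like, the sketch below assumes a channel gate followed by a spatial gate, similar in spirit to CBAM-style modules, and uses parameter-free sigmoid gates in place of learned layers. All function names are hypothetical and this is not the authors' implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a sigmoid gate of its global average.

    fmap is a list of C channels, each an H x W nested list of floats.
    Assumption: a real module would use learned weights here.
    """
    gated = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        gated.append([[v * gate for v in row] for row in channel])
    return gated


def spatial_attention(fmap):
    """Scale each position by a sigmoid gate of its mean across channels."""
    n_channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    gates = [
        [
            sigmoid(sum(fmap[c][i][j] for c in range(n_channels)) / n_channels)
            for j in range(width)
        ]
        for i in range(height)
    ]
    return [
        [[fmap[c][i][j] * gates[i][j] for j in range(width)] for i in range(height)]
        for c in range(n_channels)
    ]


def dual_attention_serial(fmap):
    """Apply the two gates one after the other (assumed order: channel, then spatial)."""
    return spatial_attention(channel_attention(fmap))
```

In a residual network, a block like this would sit between two residual stages and leave the feature map shape unchanged, which is what allows it to be inserted without altering the surrounding layers.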

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b05/11679380/8d456fceb1a4/sensors-24-07917-g001.jpg
