Towards End-to-End Text Spotting in Natural Scenes.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7266-7281. doi: 10.1109/TPAMI.2021.3095916. Epub 2022 Sep 14.

Abstract

Text spotting in natural scene images is of great importance for many image understanding tasks. It includes two sub-tasks: text detection and recognition. In this work, we propose a unified network that simultaneously localizes and recognizes text with a single forward pass, avoiding intermediate processes such as image cropping and feature re-calculation, word separation, and character grouping. The overall framework is trained end-to-end and is able to spot text of arbitrary shapes. The convolutional features are calculated only once and shared by both the detection and recognition modules. Through multi-task training, the learned features become more discriminative and improve the overall performance. By employing a 2D attention model in word recognition, the issue of text irregularity is robustly addressed. The attention model provides the spatial location for each character, which not only helps local feature extraction in word recognition, but also indicates an orientation angle to refine text localization. Experiments demonstrate that our proposed method can achieve state-of-the-art performance on several commonly used text spotting benchmarks, including both regular and irregular datasets. Extensive ablation experiments are performed to verify the effectiveness of each module design.
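The abstract notes that the attention model yields a spatial location for each character and that these locations indicate an orientation angle. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes each decoding step produces a 2D attention map (here a plain list of rows), takes the attention-weighted centroid as the character position, and estimates the text orientation from the principal axis of those centroids. The function names `attention_centroid` and `orientation_degrees` are hypothetical, and the principal-axis fit is a swapped-in illustration, since the abstract does not say how the angle is computed.

```python
import math


def attention_centroid(attn):
    """Return the attention-weighted mean (x, y) of a 2D map given as a list of rows.

    Assumption: x is the column index and y is the row index (image
    coordinates, with y growing downward).
    """
    total = sum(sum(row) for row in attn)
    if total <= 0:
        raise ValueError("attention map must have positive total weight")
    x = sum(w * c for row in attn for c, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(attn) for w in row) / total
    return x, y


def orientation_degrees(centers):
    """Estimate the orientation of a character sequence from its centers.

    Uses the principal axis of the 2D point set, so the result lies in
    (-90, 90] degrees and does not encode reading direction.
    """
    n = len(centers)
    if n < 2:
        raise ValueError("need at least two character centers")
    mx = sum(x for x, _ in centers) / n
    my = sum(y for _, y in centers) / n
    sxx = sum((x - mx) ** 2 for x, _ in centers)
    syy = sum((y - my) ** 2 for _, y in centers)
    sxy = sum((x - mx) * (y - my) for x, y in centers)
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))
```

In this sketch, one attention map per decoded character would be passed through `attention_centroid`, and the resulting list of centers would be passed to `orientation_degrees`. How the angle is then used to refine the detected text region is not described in the abstract and is left out here.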
