Xu Yongchao, Wang Yukang, Zhou Wei, Wang Yongpan, Yang Zhibo, Bai Xiang
IEEE Trans Image Process. 2019 Nov;28(11):5566-5579. doi: 10.1109/TIP.2019.2900589. Epub 2019 Feb 21.
Scene text detection is an important step in a scene text reading system. The main challenges lie in significantly varied sizes and aspect ratios, arbitrary orientations, and irregular shapes. Driven by recent progress in deep learning, impressive performance has been achieved for multi-oriented text detection. Yet performance drops dramatically when detecting curved text because of limited text representations (e.g., horizontal bounding boxes, rotated rectangles, or quadrilaterals). Detecting curved text, which is actually very common in natural scenes, is therefore of great interest. In this paper, we present a novel text detector named TextField for detecting irregular scene text. Specifically, we learn a direction field that, at each text point, points away from the nearest text boundary. This direction field is represented by an image of 2D vectors and learned via a fully convolutional neural network. It encodes both the binary text mask and the direction information used to separate adjacent text instances, a separation that is challenging for classical segmentation-based approaches. Based on the learned direction field, we apply a simple yet effective morphology-based post-processing step to obtain the final detections. The experimental results show that the proposed TextField outperforms the state-of-the-art methods by a large margin (28% and 8%) on two curved text datasets, Total-Text and SCUT-CTW1500, respectively; TextField also achieves very competitive performance on the multi-oriented datasets ICDAR 2015 and MSRA-TD500. Furthermore, TextField generalizes robustly to unseen datasets.
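To make the direction-field representation concrete, the following is a minimal sketch of how such a field could be built from a ground-truth binary text mask. It is an illustration under stated assumptions, not the authors' implementation: the function names `direction_field` and `recover_mask` are hypothetical, the vectors are assumed to be normalized to unit length, non-text pixels are assumed to carry the zero vector, and the nearest non-text pixel is found by brute force for clarity rather than speed.

```python
import math


def direction_field(mask):
    """Build a 2D direction field from a binary text mask.

    mask: list of rows, 1 for text pixels and 0 for non-text pixels.
    Returns field[y][x] = (vx, vy). For a text pixel this is the unit
    vector pointing away from its nearest non-text pixel (assumed
    normalization); for a non-text pixel it is (0.0, 0.0).
    """
    h, w = len(mask), len(mask[0])
    background = [(x, y) for y in range(h) for x in range(w) if not mask[y][x]]
    if not background:
        raise ValueError("mask has no non-text pixel to measure from")

    field = [[(0.0, 0.0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue  # non-text pixels keep the zero vector
            # Brute-force search for the nearest non-text pixel.
            nx, ny = min(background,
                         key=lambda b: (b[0] - x) ** 2 + (b[1] - y) ** 2)
            dx, dy = x - nx, y - ny  # points from the boundary into the text
            norm = math.hypot(dx, dy)
            field[y][x] = (dx / norm, dy / norm)
    return field


def recover_mask(field, threshold=0.5):
    """Recover the binary text mask by thresholding the vector magnitude."""
    return [[1 if math.hypot(vx, vy) > threshold else 0 for vx, vy in row]
            for row in field]


# A single row containing one three-pixel text run.
demo_mask = [[0, 1, 1, 1, 0]]
demo_field = direction_field(demo_mask)
```

In this sketch the magnitude of each vector carries the text mask, while the direction carries the instance cue: pixels just inside opposite edges of a text run point toward each other, so pixels on either side of the border between two touching instances point in opposing directions.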