Boundary TextSpotter: Toward Arbitrary-Shaped Scene Text Spotting.

Author information

Lu Pu, Wang Hao, Zhu Shenggao, Wang Jing, Bai Xiang, Liu Wenyu

Publication information

IEEE Trans Image Process. 2022;31:6200-6212. doi: 10.1109/TIP.2022.3206615. Epub 2022 Sep 28.

Abstract

Reading arbitrary-shaped text in an end-to-end fashion has received growing interest in computer vision. In this paper, we study the problem of scene text spotting, which aims to detect and recognize text in cluttered images simultaneously, and propose an end-to-end trainable neural network named Boundary TextSpotter. Unlike existing methods, which describe the shape of a text instance with a bounding box or a shape mask, Boundary TextSpotter formulates it as a set of boundary points. This boundary-point representation also provides the reading order of the text. Because the representation benefits both detection and recognition, Boundary TextSpotter can easily handle text of arbitrary shapes. Further, to detect the boundary points efficiently, a single-stage text detector is proposed, which runs at nearly real-time speed. Experiments on three challenging datasets, ICDAR2015, Total-Text, and CTW1500, demonstrate that the proposed method achieves state-of-the-art or competitive results while significantly improving inference speed.
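To make the boundary-point idea concrete, here is a minimal Python sketch of how a set of ordered boundary points can describe both the shape of a text instance and its reading order. The abstract does not say how the points are laid out, so the layout used here (equal numbers of points on the upper and lower edges, both listed in reading order) is an assumption, and the names `boundary_to_polygon` and `center_line` are hypothetical helpers for illustration, not part of the authors' implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def boundary_to_polygon(top: List[Point], bottom: List[Point]) -> List[Point]:
    """Build a closed polygon outline from two ordered boundary edges.

    Assumption: `top` and `bottom` hold the same number of points,
    both listed in reading order (first character to last).
    """
    if len(top) != len(bottom):
        raise ValueError("top and bottom must have the same number of points")
    # Walk the upper edge in reading order, then return along the lower edge.
    return list(top) + list(reversed(bottom))


def center_line(top: List[Point], bottom: List[Point]) -> List[Point]:
    """Midpoints of paired boundary points, ordered along the reading direction."""
    return [
        ((tx + bx) / 2.0, (ty + by) / 2.0)
        for (tx, ty), (bx, by) in zip(top, bottom)
    ]


# A straight horizontal word described by three point pairs.
top_edge = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
bottom_edge = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]

polygon = boundary_to_polygon(top_edge, bottom_edge)
midline = center_line(top_edge, bottom_edge)
```

Because the point pairs are ordered, the same list that outlines the region also tells a recognizer which end of a curved word to start from, which a plain bounding box or an unordered mask does not.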
