A hybrid approach to detect and localize texts in natural scene images.

Affiliation

National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing 100190, China.

Publication information

IEEE Trans Image Process. 2011 Mar;20(3):800-13. doi: 10.1109/TIP.2010.2070803. Epub 2010 Sep 2.

Abstract

Text detection and localization in natural scene images is important for content-based image analysis. The problem is challenging because of complex backgrounds, non-uniform illumination, and variations in text font, size, and line orientation. In this paper, we present a hybrid approach to robustly detect and localize text in natural scene images. A text region detector is designed to estimate the text-existence confidence and scale information in an image pyramid, which helps segment candidate text components by local binarization. To efficiently filter out non-text components, we propose a conditional random field (CRF) model that considers unary component properties and binary contextual relationships between components, with supervised parameter learning. Finally, text components are grouped into text lines or words with a learning-based energy minimization method. Since all three stages are learning-based, very few parameters require manual tuning. Experimental results on the ICDAR 2005 competition dataset show that our approach yields higher precision and recall than state-of-the-art methods. We also evaluated our approach on a multilingual image dataset, with promising results.
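The abstract does not say which local binarization rule is used, so the following is only a minimal sketch of the general idea, assuming a Niblack-style threshold (local mean plus k times local standard deviation). The function name `niblack_binarize` and the parameters `radius` and `k` are hypothetical; in the approach described above, the window size would presumably be informed by the scale estimated by the text region detector rather than fixed by hand.

```python
from statistics import mean, pstdev


def niblack_binarize(image, radius=1, k=-0.2):
    """Binarize a grayscale image (list of rows of intensities).

    Assumption: a Niblack-style rule, not necessarily the paper's.
    A pixel is labeled 1 (dark candidate component) when its value is
    below T = local_mean + k * local_std, where the statistics are taken
    over a (2*radius+1) x (2*radius+1) window clipped at the borders.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the intensities in the clipped local window.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            threshold = mean(window) + k * pstdev(window)
            result[y][x] = 1 if image[y][x] < threshold else 0
    return result


# Toy input: a bright 5x5 patch with one dark pixel in the middle.
toy = [[200] * 5 for _ in range(5)]
toy[2][2] = 50
mask = niblack_binarize(toy)
```

Because the threshold adapts to each neighborhood, a rule of this kind tolerates the non-uniform illumination mentioned in the abstract better than a single global threshold would. The connected components of the resulting mask would then be the candidates passed to the filtering stage.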

