Kim Wonjun, Kim Changick
Department of Electronic Engineering, Information and Communications University, Daejeon, Korea.
IEEE Trans Image Process. 2009 Feb;18(2):401-11. doi: 10.1109/TIP.2008.2008225. Epub 2008 Dec 16.
Overlay text provides important semantic clues for video content analysis tasks such as video information retrieval and summarization, since the content of the scene or the editor's intention can be well represented by the inserted text. Most previous approaches to extracting overlay text from videos are based on low-level features, such as edge, color, and texture information. However, existing methods have difficulty handling text with varying contrast or text inserted into a complex background. In this paper, we propose a novel framework to detect and extract overlay text from the video scene. Based on our observation that transient colors exist between inserted text and its adjacent background, a transition map is first generated. Candidate regions are then extracted by a reshaping method, and the overlay text regions are determined based on the occurrence of overlay text in each candidate. The detected overlay text regions are localized accurately using the projection of overlay text pixels in the transition map, and the text extraction is finally conducted. The proposed method is robust to different character sizes, positions, contrasts, and colors, and it is language independent. Overlay text region update between frames is also employed to reduce the processing time. Experiments are performed on diverse videos to confirm the efficiency of the proposed method.
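The transition-map idea above can be illustrated with a minimal sketch: mark pixels where the intensity changes sharply from one horizontal neighbor to the next, a simplified stand-in for the paper's transition-pixel test. The function name, threshold value, and use of a plain absolute-difference criterion are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def transition_map(gray, threshold=40):
    """Mark pixels whose horizontal intensity change to the next pixel
    exceeds `threshold`.

    This is a simplified, hypothetical version of a transition map:
    the original method uses a more elaborate change-of-intensity
    criterion, but the idea of flagging sharp text/background
    boundaries is the same.
    """
    gray = gray.astype(np.int32)
    diff = np.abs(np.diff(gray, axis=1))       # |I(y, x+1) - I(y, x)|
    tmap = np.zeros(gray.shape, dtype=np.uint8)
    tmap[:, :-1] = (diff >= threshold).astype(np.uint8)
    return tmap

# Toy frame: dark background with a bright inserted "text" stripe.
frame = np.full((6, 12), 20, dtype=np.uint8)
frame[2:4, 4:8] = 230
tmap = transition_map(frame)
# Rows crossing the stripe get transition pixels at its left and
# right boundaries; background-only rows stay empty.
```

In the real pipeline, connected runs of such transition pixels would then be grouped into candidate regions, and horizontal/vertical projections of the map would localize the text boxes.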