Text from corners: a novel approach to detect text and caption in videos.

Affiliation

School of Electronic, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.

Publication information

IEEE Trans Image Process. 2011 Mar;20(3):790-9. doi: 10.1109/TIP.2010.2068553. Epub 2010 Aug 19.

DOI: 10.1109/TIP.2010.2068553
PMID: 20729170

Abstract

Detecting text and captions in videos is important and in great demand for video retrieval, annotation, indexing, and content analysis. In this paper, we present a corner-based approach to detecting text and captions in videos. The approach is inspired by the observation that corner points occur densely and in an orderly pattern in characters, especially in text and captions. We use several discriminative features to describe the text regions formed by the corner points. These features can be used flexibly and thus adapted to different applications. Language independence is an important advantage of the proposed method. Moreover, building on the text features, we develop a novel algorithm to detect moving captions in videos. In this algorithm, motion features extracted by optical flow are combined with text features to detect moving caption patterns, and a decision tree is adopted to learn the classification criteria. Experiments conducted on a large volume of real video shots demonstrate the efficiency and robustness of the proposed approaches and of the real-world system. Our text and caption detection system was recently highlighted in a worldwide multimedia retrieval competition, Star Challenge, where it achieved top-ranked performance.
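The abstract does not name the corner detector, the region features, or any thresholds, so the following is only a minimal sketch of the general idea: corner points that cluster densely along a line are grouped into regions, and each region is kept or rejected by a few simple shape features. It assumes the corner points have already been extracted by some detector. The function names, the choice of features (corner count, aspect ratio, corner density), and all threshold values are hypothetical illustrations and are not taken from the paper.

```python
from collections import deque


def dilate(mask, radius):
    """Binary dilation with a square window of the given radius."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = True
    return out


def connected_regions(mask):
    """Yield the (x, y) pixels of each 4-connected region in a binary mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(x, y)]), []
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            yield pixels


def candidate_text_boxes(corners, width, height, radius=2,
                         min_corners=6, min_aspect=2.0, min_density=0.03):
    """Return bounding boxes (x0, y0, x1, y1) of corner clusters that look text-like.

    All thresholds are illustrative assumptions, not values from the paper.
    Corner coordinates are assumed to lie inside the image.
    """
    mask = [[False] * width for _ in range(height)]
    for x, y in corners:
        mask[y][x] = True
    corner_set = set(corners)
    boxes = []
    # Dilation merges nearby corners into one region per text line candidate.
    for pixels in connected_regions(dilate(mask, radius)):
        xs = [p[0] for p in pixels]
        ys = [p[1] for p in pixels]
        x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
        box_w, box_h = x1 - x0 + 1, y1 - y0 + 1
        n_corners = sum(1 for p in pixels if p in corner_set)
        aspect = box_w / box_h                 # horizontal text lines are wide
        density = n_corners / (box_w * box_h)  # text has many corners per area
        if n_corners >= min_corners and aspect >= min_aspect and density >= min_density:
            boxes.append((x0, y0, x1, y1))
    return boxes


# Usage: a horizontal run of corners (caption-like) plus two isolated corners.
demo_corners = [(x, 10 + 2 * (i % 2)) for i, x in enumerate(range(5, 36, 3))]
demo_corners += [(50, 40), (55, 50)]
boxes = candidate_text_boxes(demo_corners, width=64, height=64)
print(boxes)
```

In this toy input the dense horizontal run of corners survives the filters while the two isolated corners are discarded. Combining such region features with optical-flow motion features in a decision tree, as the abstract describes for moving captions, is not shown here.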

Similar articles

1. Text from corners: a novel approach to detect text and caption in videos.
   IEEE Trans Image Process. 2011 Mar;20(3):790-9. doi: 10.1109/TIP.2010.2068553. Epub 2010 Aug 19.
2. A new approach for overlay text detection and extraction from complex video scene.
   IEEE Trans Image Process. 2009 Feb;18(2):401-11. doi: 10.1109/TIP.2008.2008225. Epub 2008 Dec 16.
3. Modeling semantic aspects for cross-media image indexing.
   IEEE Trans Pattern Anal Mach Intell. 2007 Oct;29(10):1802-17. doi: 10.1109/TPAMI.2007.1097.
4. A Unified Framework for Tracking Based Text Detection and Recognition from Web Videos.
   IEEE Trans Pattern Anal Mach Intell. 2018 Mar;40(3):542-554. doi: 10.1109/TPAMI.2017.2692763. Epub 2017 Apr 12.
5. Using language to learn structured appearance models for image annotation.
   IEEE Trans Pattern Anal Mach Intell. 2010 Jan;32(1):148-64. doi: 10.1109/TPAMI.2008.283.
6. Efficient visual search of videos cast as text retrieval.
   IEEE Trans Pattern Anal Mach Intell. 2009 Apr;31(4):591-606. doi: 10.1109/TPAMI.2008.111.
7. Correlative linear neighborhood propagation for video annotation.
   IEEE Trans Syst Man Cybern B Cybern. 2009 Apr;39(2):409-16. doi: 10.1109/TSMCB.2008.2006045. Epub 2008 Dec 16.
8. The semantic pathfinder: using an authoring metaphor for generic multimedia indexing.
   IEEE Trans Pattern Anal Mach Intell. 2006 Oct;28(10):1678-89. doi: 10.1109/TPAMI.2006.212.
9. Cap4Video++: Enhancing Video Understanding With Auxiliary Captions.
   IEEE Trans Pattern Anal Mach Intell. 2025 Jul;47(7):5223-5237. doi: 10.1109/TPAMI.2024.3410329.
10. Parameters in television captioning for deaf and hard-of-hearing adults: effects of caption rate versus text reduction on comprehension.
    J Deaf Stud Deaf Educ. 2008 Summer;13(3):391-404. doi: 10.1093/deafed/enn003. Epub 2008 Mar 27.

Cited by

1. Rotation-invariant features for multi-oriented text detection in natural images.
   PLoS One. 2013 Aug 5;8(8):e70173. doi: 10.1371/journal.pone.0070173. Print 2013.
2. Text Extraction from Scene Images by Character Appearance and Structure Modeling.
   Comput Vis Image Underst. 2013 Feb 1;117(2):182-194. doi: 10.1016/j.cviu.2012.11.002.