Text-Attentional Convolutional Neural Network for Scene Text Detection.

Publication information

IEEE Trans Image Process. 2016 Jun;25(6):2529-41. doi: 10.1109/TIP.2016.2547588.

DOI: 10.1109/TIP.2016.2547588
PMID: 27093723
Abstract

Recent deep learning models have demonstrated strong capabilities for classifying text and non-text components in natural images. They extract a high-level feature globally computed from a whole image component (patch), where the cluttered background information may dominate true text features in the deep representation. This leads to less discriminative power and poorer robustness. In this paper, we present a new system for scene text detection by proposing a novel text-attentional convolutional neural network (Text-CNN) that particularly focuses on extracting text-related regions and features from the image components. We develop a new learning mechanism to train the Text-CNN with multi-level and rich supervised information, including text region mask, character label, and binary text/non-text information. The rich supervision information enables the Text-CNN with a strong capability for discriminating ambiguous texts, and also increases its robustness against complicated background components. The training process is formulated as a multi-task learning problem, where low-level supervised information greatly facilitates the main task of text/non-text classification. In addition, a powerful low-level detector called contrast-enhancement maximally stable extremal regions (MSERs) is developed, which extends the widely used MSERs by enhancing intensity contrast between text patterns and background. This allows it to detect highly challenging text patterns, resulting in a higher recall. Our approach achieved promising results on the ICDAR 2013 data set, with an F-measure of 0.82, substantially improving the state-of-the-art results.
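The multi-task formulation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract names the three supervision signals (binary text/non-text label, character label, text region mask) but does not give the loss functions or weights, so the cross-entropy and mean-squared-error terms, the weights `w_char` and `w_mask`, the dictionary keys, and the choice to apply the auxiliary terms only when their labels are present are all assumptions made for illustration.

```python
import math

EPS = 1e-12  # guards against log(0)


def binary_loss(p_text, is_text):
    """Main task: cross-entropy on the text/non-text probability (assumed form)."""
    p = min(max(p_text, EPS), 1.0 - EPS)
    return -math.log(p) if is_text else -math.log(1.0 - p)


def char_label_loss(char_probs, char_index):
    """Auxiliary task: cross-entropy over character classes (assumed form)."""
    return -math.log(max(char_probs[char_index], EPS))


def region_mask_loss(pred_mask, true_mask):
    """Auxiliary task: mean squared error against the text-region mask (assumed form)."""
    return sum((p - t) ** 2 for p, t in zip(pred_mask, true_mask)) / len(pred_mask)


def multi_task_loss(sample, w_char=1.0, w_mask=1.0):
    """Weighted sum of the main loss and whichever auxiliary losses have labels.

    `sample` is a hypothetical dict; `w_char` and `w_mask` are hypothetical weights.
    """
    loss = binary_loss(sample["p_text"], sample["is_text"])
    if sample.get("char_index") is not None:
        loss += w_char * char_label_loss(sample["char_probs"], sample["char_index"])
    if sample.get("true_mask") is not None:
        loss += w_mask * region_mask_loss(sample["pred_mask"], sample["true_mask"])
    return loss


# A text patch with all three kinds of supervision (toy values).
text_patch = {
    "p_text": 0.5, "is_text": True,
    "char_probs": [0.5, 0.5], "char_index": 0,
    "pred_mask": [1.0, 0.0], "true_mask": [1.0, 0.0],
}
# A background patch with only the binary label.
background_patch = {"p_text": 0.5, "is_text": False}

print(multi_task_loss(text_patch))
print(multi_task_loss(background_patch))
```

Setting `w_char` and `w_mask` to zero reduces the sketch to plain text/non-text classification, which is one way to see the auxiliary signals as additions to the main task rather than replacements for it.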

Similar articles

1
Text-Attentional Convolutional Neural Network for Scene Text Detection.
IEEE Trans Image Process. 2016 Jun;25(6):2529-41. doi: 10.1109/TIP.2016.2547588.
2
Scene text detection via extremal region based double threshold convolutional network classification.
PLoS One. 2017 Aug 18;12(8):e0182227. doi: 10.1371/journal.pone.0182227. eCollection 2017.
3
Robust Text Detection in Natural Scene Images.
IEEE Trans Pattern Anal Mach Intell. 2014 May;36(5):970-83. doi: 10.1109/TPAMI.2013.182.
4
Co-trained convolutional neural networks for automated detection of prostate cancer in multi-parametric MRI.
Med Image Anal. 2017 Dec;42:212-227. doi: 10.1016/j.media.2017.08.006. Epub 2017 Aug 24.
5
A novel biomedical image indexing and retrieval system via deep preference learning.
Comput Methods Programs Biomed. 2018 May;158:53-69. doi: 10.1016/j.cmpb.2018.02.003. Epub 2018 Feb 6.
6
Locally Supervised Deep Hybrid Model for Scene Recognition.
IEEE Trans Image Process. 2017 Feb;26(2):808-820. doi: 10.1109/TIP.2016.2629443. Epub 2016 Nov 16.
7
A hybrid approach to detect and localize texts in natural scene images.
IEEE Trans Image Process. 2011 Mar;20(3):800-13. doi: 10.1109/TIP.2010.2070803. Epub 2010 Sep 2.
8
TextField: Learning a Deep Direction Field for Irregular Scene Text Detection.
IEEE Trans Image Process. 2019 Nov;28(11):5566-5579. doi: 10.1109/TIP.2019.2900589. Epub 2019 Feb 21.
9
Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks.
IEEE Trans Pattern Anal Mach Intell. 2016 Sep;38(9):1734-47. doi: 10.1109/TPAMI.2015.2496141. Epub 2015 Oct 29.
10
Clinical text classification with rule-based features and knowledge-guided convolutional neural networks.
BMC Med Inform Decis Mak. 2019 Apr 4;19(Suppl 3):71. doi: 10.1186/s12911-019-0781-4.

Cited by

1
Social media crisis communication and public engagement during COVID-19 analyzing public health and news media organizations' tweeting strategies.
Sci Rep. 2025 May 24;15(1):18082. doi: 10.1038/s41598-025-90759-w.
2
An efficient breast cancer classification model using bilateral filtering and fuzzy convolutional neural network.
Sci Rep. 2024 Mar 15;14(1):6290. doi: 10.1038/s41598-024-56698-8.
3
Chronic disease diagnosis model based on convolutional neural network and ensemble learning method.
Digit Health. 2023 Aug 31;9:20552076231198643. doi: 10.1177/20552076231198643. eCollection 2023 Jan-Dec.
4
A fine-tuned YOLOv5 deep learning approach for real-time house number detection.
PeerJ Comput Sci. 2023 Jul 3;9:e1453. doi: 10.7717/peerj-cs.1453. eCollection 2023.
5
Association Mining of Near Misses in Hydropower Engineering Construction Based on Convolutional Neural Network Text Classification.
Comput Intell Neurosci. 2022 Jan 3;2022:4851615. doi: 10.1155/2022/4851615. eCollection 2022.
6
LPI-HyADBS: a hybrid framework for lncRNA-protein interaction prediction integrating feature selection and classification.
BMC Bioinformatics. 2021 Nov 26;22(1):568. doi: 10.1186/s12859-021-04485-x.
7
Urdu text in natural scene images: a new dataset and preliminary text detection.
PeerJ Comput Sci. 2021 Sep 16;7:e717. doi: 10.7717/peerj-cs.717. eCollection 2021.
8
MSF-Net: Multi-Scale Feature Learning Network for Classification of Surface Defects of Multifarious Sizes.
Sensors (Basel). 2021 Jul 29;21(15):5125. doi: 10.3390/s21155125.
9
Lung Infection Segmentation for COVID-19 Pneumonia Based on a Cascade Convolutional Network from CT Images.
Biomed Res Int. 2021 Apr 15;2021:5544742. doi: 10.1155/2021/5544742. eCollection 2021.
10
Cascaded Segmentation-Detection Networks for Word-Level Text Spotting.
Proc Int Conf Doc Anal Recognit. 2017 Nov;2017:1275-1282. doi: 10.1109/ICDAR.2017.210. Epub 2018 Jan 29.