Mixed-Supervised Scene Text Detection With Expectation-Maximization Algorithm

Authors

Zhao Mengbiao, Feng Wei, Yin Fei, Zhang Xu-Yao, Liu Cheng-Lin

Publication information

IEEE Trans Image Process. 2022;31:5513-5528. doi: 10.1109/TIP.2022.3197987. Epub 2022 Aug 22.

DOI:10.1109/TIP.2022.3197987
PMID:35976822
Abstract

Scene text detection is an important and challenging task in computer vision. For detecting arbitrarily-shaped texts, most existing methods require heavy data labeling efforts to produce polygon-level text region labels for supervised training. In order to reduce the cost in data labeling, we study mixed-supervised arbitrarily-shaped text detection by combining various weak supervision forms (e.g., image-level tags, coarse, loose and tight bounding boxes), which are far easier to annotate. Whereas the existing weakly-supervised learning methods (such as multiple instance learning) do not promote full object coverage, to approximate the performance of fully-supervised detection, we propose an Expectation-Maximization (EM) based mixed-supervised learning framework to train scene text detector using only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data. The polygon-level labels are treated as latent variables and recovered from the weak labels by the EM algorithm. A new contour-based scene text detector is also proposed to facilitate the use of weak labels in our mixed-supervised learning framework. Extensive experiments on six scene text benchmarks show that (1) using only 10% strongly annotated data and 90% weakly annotated data, our method yields comparable performance to that of fully supervised methods, (2) with 100% strongly annotated data, our method achieves state-of-the-art performance on five scene text benchmarks (CTW1500, Total-Text, ICDAR-ArT, MSRA-TD500, and C-SVT), and competitive results on the ICDAR2015 Dataset. We will make our weakly annotated datasets publicly available.
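The abstract's central idea, treating the fine-grained labels as latent variables and recovering them from weak labels with EM, can be illustrated with a toy sketch. This is not the authors' framework or their contour-based detector: it assumes a simple two-class Gaussian model of pixel intensity in place of a neural detector, synthetic images with an elliptical "text" region, and a loose bounding box as the only weak label. All names (`make_toy_image`, `em_recover_masks`, `run_demo`) are hypothetical and exist only for this illustration.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian evaluated element-wise."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def make_toy_image(rng, size=32):
    """Synthetic image: bright elliptical 'text' region on a dark background.

    Returns flattened (pixels, true_mask, loose_box). The loose box strictly
    contains the ellipse and plays the role of the weak annotation.
    """
    yy, xx = np.mgrid[0:size, 0:size]
    cy, cx = rng.integers(10, size - 10, size=2)
    true_mask = ((yy - cy) / 4.0) ** 2 + ((xx - cx) / 8.0) ** 2 <= 1.0
    box = (np.abs(yy - cy) <= 6) & (np.abs(xx - cx) <= 10)
    pixels = rng.normal(np.where(true_mask, 1.0, 0.0), 0.2)
    return pixels.ravel(), true_mask.ravel(), box.ravel()


def em_recover_masks(x, box, known, known_mask, n_iter=20):
    """EM over latent per-pixel labels (assumed toy model, not the paper's).

    x          : pixel intensities
    box        : True inside the weak bounding box (outside is background)
    known      : True where a strong (fine-grained) label is available
    known_mask : the strong label, only read where `known` is True
    Returns the probability that each pixel is text.
    """
    strong = known_mask.astype(float)
    # Start undecided (0.5) inside weak boxes, background (0) outside them.
    resp = np.where(known, strong, 0.5 * box)
    for _ in range(n_iter):
        # M-step: refit class Gaussians and the in-box text prior.
        w_t, w_b = resp, 1.0 - resp
        mu_t = (w_t * x).sum() / w_t.sum()
        mu_b = (w_b * x).sum() / w_b.sum()
        var_t = (w_t * (x - mu_t) ** 2).sum() / w_t.sum() + 1e-6
        var_b = (w_b * (x - mu_b) ** 2).sum() / w_b.sum() + 1e-6
        prior = resp[box].mean()
        # E-step: posterior of the latent label, constrained by the labels.
        p_t = prior * gaussian_pdf(x, mu_t, var_t)
        p_b = (1.0 - prior) * gaussian_pdf(x, mu_b, var_b)
        post = p_t / (p_t + p_b + 1e-300)
        resp = np.where(known, strong, np.where(box, post, 0.0))
    return resp


def iou(a, b):
    """Intersection over union of two boolean arrays."""
    return (a & b).sum() / (a | b).sum()


def run_demo(seed=0, n_images=10, n_strong=1):
    """Train on n_strong fully labelled images plus box-only images.

    Returns (IoU of raw boxes, IoU of EM-recovered masks) on the weak images.
    """
    rng = np.random.default_rng(seed)
    images = [make_toy_image(rng) for _ in range(n_images)]
    x = np.concatenate([im[0] for im in images])
    truth = np.concatenate([im[1] for im in images])
    box = np.concatenate([im[2] for im in images])
    n_pix = images[0][0].size
    known = np.arange(x.size) < n_strong * n_pix  # first image(s) are strong
    resp = em_recover_masks(x, box, known, truth & known)
    weak = ~known
    pred = resp > 0.5
    return iou(box[weak], truth[weak]), iou(pred[weak], truth[weak])


if __name__ == "__main__":
    iou_box, iou_em = run_demo()
    print(f"box IoU: {iou_box:.3f}  EM-recovered IoU: {iou_em:.3f}")
```

With one strongly labelled image out of ten, the defaults loosely mirror the 10%/90% split described in the abstract. The sketch only shows the mechanics: the E-step fills in the missing fine-grained labels under the constraint imposed by the weak label, and the M-step refits the model on those soft labels. The toy intensities are well separated, so the recovered masks should track the true regions much more closely than the boxes do.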

