Visual and Semantic Knowledge Transfer for Large Scale Semi-Supervised Object Detection.

Authors

Tang Yuxing, Wang Josiah, Wang Xiaofang, Gao Boyang, Dellandrea Emmanuel, Gaizauskas Robert, Chen Liming

Publication information

IEEE Trans Pattern Anal Mach Intell. 2018 Dec;40(12):3045-3058. doi: 10.1109/TPAMI.2017.2771779. Epub 2017 Nov 9.

DOI:10.1109/TPAMI.2017.2771779

Abstract

Deep CNN-based object detection systems have achieved remarkable success on several large-scale object detection benchmarks. However, training such detectors requires a large number of labeled bounding boxes, which are more difficult to obtain than image-level annotations. Previous work addresses this issue by transforming image-level classifiers into object detectors. This is done by modeling the differences between the two on categories with both image-level and bounding box annotations, and transferring this information to convert classifiers to detectors for categories without bounding box annotations. We improve this previous work by incorporating knowledge about object similarities from visual and semantic domains during the transfer process. The intuition behind our proposed method is that visually and semantically similar categories should exhibit more common transferable properties than dissimilar categories, e.g. a better detector would result by transforming the differences between a dog classifier and a dog detector onto the cat class, than would by transforming from the violin class. Experimental results on the challenging ILSVRC2013 detection dataset demonstrate that each of our proposed object similarity based knowledge transfer methods outperforms the baseline methods. We found strong evidence that visual similarity and semantic relatedness are complementary for the task, and when combined notably improve detection, achieving state-of-the-art detection performance in a semi-supervised setting.
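The transfer idea described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: classifiers and detectors are treated as plain weight vectors, visual and semantic similarity are blended with a hypothetical mixing weight `alpha`, and the classifier-to-detector differences of the fully annotated categories are averaged with similarity-proportional weights. The function names `combine_similarity` and `transfer_detector` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

Vector = List[float]


def combine_similarity(visual: Dict[str, float],
                       semantic: Dict[str, float],
                       alpha: float = 0.5) -> Dict[str, float]:
    """Blend visual and semantic similarity per source category.

    `alpha` is a hypothetical mixing weight, not a value from the paper.
    """
    return {c: alpha * visual[c] + (1.0 - alpha) * semantic[c] for c in visual}


def transfer_detector(target_cls: Vector,
                      source_cls: Dict[str, Vector],
                      source_det: Dict[str, Vector],
                      similarity: Dict[str, float]) -> Vector:
    """Estimate detector weights for a category without bounding boxes.

    Adds a similarity-weighted average of the (detector - classifier)
    differences of the fully annotated source categories to the target
    category's classifier weights.
    """
    total = sum(similarity.values())
    if total <= 0:
        # No similar source category: fall back to the classifier itself.
        return list(target_cls)
    adapted = list(target_cls)
    for name, sim in similarity.items():
        weight = sim / total  # normalise so the weights sum to 1
        for d in range(len(adapted)):
            diff = source_det[name][d] - source_cls[name][d]
            adapted[d] += weight * diff
    return adapted


# Toy 2-D weights: "dog" and "violin" have both annotation types,
# "cat" has an image-level classifier only.
source_cls = {"dog": [1.0, 0.0], "violin": [0.0, 1.0]}
source_det = {"dog": [1.5, 0.5], "violin": [0.0, 3.0]}
visual = {"dog": 0.9, "violin": 0.1}
semantic = {"dog": 0.7, "violin": 0.1}

sim = combine_similarity(visual, semantic)
cat_det = transfer_detector([1.0, 0.2], source_cls, source_det, sim)
print(cat_det)
```

In this toy setup the dog difference dominates the adapted cat weights because dog scores higher than violin on both similarity measures, which mirrors the dog/cat versus violin intuition in the abstract.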

Similar articles

AutoBD: Automated Bi-Level Description for Scalable Fine-Grained Visual Categorization.

IEEE Trans Image Process. 2018;27(1):10-23. doi: 10.1109/TIP.2017.2751960.

Mixed Supervised Object Detection with Robust Objectness Transfer.

IEEE Trans Pattern Anal Mach Intell. 2019 Mar;41(3):639-653. doi: 10.1109/TPAMI.2018.2810288. Epub 2018 Feb 28.

Coarse-to-Fine Semantic Segmentation From Image-Level Labels.

IEEE Trans Image Process. 2020;29:225-236. doi: 10.1109/TIP.2019.2926748. Epub 2019 Jul 12.

Weakly Supervised Large Scale Object Localization with Multiple Instance Learning and Bag Splitting.

IEEE Trans Pattern Anal Mach Intell. 2016 Feb;38(2):405-16. doi: 10.1109/TPAMI.2015.2456908.

Cyclic Self-Training With Proposal Weight Modulation for Cross-Supervised Object Detection.

IEEE Trans Image Process. 2023;32:1992-2002. doi: 10.1109/TIP.2023.3261752. Epub 2023 Apr 4.

Weakly Supervised Object Detection via Object-Specific Pixel Gradient.

IEEE Trans Neural Netw Learn Syst. 2018 Dec;29(12):5960-5970. doi: 10.1109/TNNLS.2018.2816021. Epub 2018 Apr 9.

Incorporating Network Built-in Priors in Weakly-Supervised Semantic Segmentation.

IEEE Trans Pattern Anal Mach Intell. 2018 Jun;40(6):1382-1396. doi: 10.1109/TPAMI.2017.2713785. Epub 2017 Jun 8.

Learning to Segment Human by Watching YouTube.

IEEE Trans Pattern Anal Mach Intell. 2017 Jul;39(7):1462-1468. doi: 10.1109/TPAMI.2016.2598340. Epub 2016 Aug 5.

Data-Driven Detection of Prominent Objects.

IEEE Trans Pattern Anal Mach Intell. 2016 Oct;38(10):1969-82. doi: 10.1109/TPAMI.2015.2509988. Epub 2015 Dec 17.

Cited by

CHEER: Rich Model Helps Poor Model via Knowledge Infusion.

IEEE Trans Knowl Data Eng. 2022 Feb;34(2):531-543. doi: 10.1109/tkde.2020.2989405. Epub 2020 Apr 22.

The relative contributions of visual and semantic information in the neural representation of object categories.

Brain Behav. 2019 Oct;9(10):e01373. doi: 10.1002/brb3.1373. Epub 2019 Sep 27.
