
Visual and Semantic Knowledge Transfer for Large Scale Semi-Supervised Object Detection

Author Information

Tang Yuxing, Wang Josiah, Wang Xiaofang, Gao Boyang, Dellandrea Emmanuel, Gaizauskas Robert, Chen Liming

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2018 Dec;40(12):3045-3058. doi: 10.1109/TPAMI.2017.2771779. Epub 2017 Nov 9.

Abstract

Deep CNN-based object detection systems have achieved remarkable success on several large-scale object detection benchmarks. However, training such detectors requires a large number of labeled bounding boxes, which are more difficult to obtain than image-level annotations. Previous work addresses this issue by transforming image-level classifiers into object detectors. This is done by modeling the differences between the two on categories with both image-level and bounding box annotations, and transferring this information to convert classifiers to detectors for categories without bounding box annotations. We improve on this previous work by incorporating knowledge about object similarities from the visual and semantic domains during the transfer process. The intuition behind our proposed method is that visually and semantically similar categories should exhibit more common transferable properties than dissimilar categories; e.g., a better detector results from transferring the differences between a dog classifier and a dog detector onto the cat class than from transferring those of the violin class. Experimental results on the challenging ILSVRC2013 detection dataset demonstrate that each of our proposed object-similarity-based knowledge transfer methods outperforms the baseline methods. We found strong evidence that visual similarity and semantic relatedness are complementary for the task, and when combined they notably improve detection, achieving state-of-the-art detection performance in a semi-supervised setting.
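The transfer idea described in the abstract can be sketched in a few lines: for each source category with bounding-box annotations, compute the difference between its detector and classifier weights, then estimate a target category's detector as its classifier plus a similarity-weighted average of those differences. This is a minimal illustrative sketch, not the paper's implementation; the toy weight vectors, the `sim` scores, and the function name `transfer_detector` are all assumptions introduced for illustration.

```python
import numpy as np

# Hypothetical toy data (illustrative assumption, not from the paper):
# classifier weight vectors for all categories, and detector weight
# vectors only for source categories that have box annotations.
rng = np.random.default_rng(0)
dim = 8
source_cats = ["dog", "violin"]
clf = {c: rng.normal(size=dim) for c in source_cats + ["cat"]}
det = {c: clf[c] + rng.normal(scale=0.1, size=dim) for c in source_cats}

# Similarity of the target category ("cat") to each source category,
# e.g. derived from visual feature distances or semantic (word-embedding)
# relatedness. The numbers here are made up for the example.
sim = {"dog": 0.9, "violin": 0.1}

def transfer_detector(target, clf, det, sim):
    """Estimate detector weights for a category without box annotations:
    add a similarity-weighted average of the source categories'
    classifier-to-detector differences to the target's classifier."""
    total = sum(sim.values())
    delta = sum(s * (det[c] - clf[c]) for c, s in sim.items()) / total
    return clf[target] + delta

w_cat = transfer_detector("cat", clf, det, sim)
print(w_cat.shape)  # (8,)
```

The weighting is why a visually and semantically close source (dog) dominates the transferred difference, while a distant one (violin) contributes little, mirroring the intuition stated in the abstract.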

