

Few-Shot Common-Object Reasoning Using Common-Centric Localization Network.

Author Information

Zhu Linchao, Fan Hehe, Luo Yawei, Xu Mingliang, Yang Yi

Publication Information

IEEE Trans Image Process. 2021;30:4253-4262. doi: 10.1109/TIP.2021.3070733. Epub 2021 Apr 14.

Abstract

In the few-shot common-localization task, given a few support images without bounding-box annotations in each episode, the goal is to localize the common object in a query image of unseen categories. The task involves reasoning about the common object across the given images and predicting the spatial locations of objects with different shapes, sizes, and orientations. In this work, we propose a common-centric localization (CCL) network for few-shot common-localization. The motivation of our common-centric localization network is to learn common-object features by dynamic feature relation reasoning via a graph convolutional network with conditional feature aggregation. First, we propose a local common-object region generation pipeline to reduce background noise caused by feature misalignment. Each support image predicts more accurate object spatial locations by replacing the query with the images in the support set. Second, we introduce a graph convolutional network with dynamic feature transformation to enforce common-object reasoning. To enhance discriminability during feature matching and enable better generalization to unseen scenarios, we leverage a conditional feature encoding function that adaptively alters visual features according to the input query. Third, we introduce a common-centric relation structure to model the correlation between the common features and the query image feature. The generated common features guide the query image feature toward a more common-object-related representation. We evaluate our common-centric localization network on four datasets, i.e., CL-VOC-07, CL-VOC-12, CL-COCO, and CL-VID, and obtain significant improvements over the state of the art. Our quantitative results confirm the effectiveness of our network.
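The abstract describes three components: conditional feature encoding of support features against the query, graph-convolutional reasoning over those features, and common-centric guidance of the query representation. A minimal sketch of how those pieces could compose is below; the function names, the sigmoid gating, the mean pooling, and all weight matrices are assumptions for illustration, not the authors' actual implementation.

```python
import numpy as np

def conditional_encode(support_feats, query_feat, W_gate):
    # Hypothetical conditional feature encoding: modulate each support
    # feature by a gate computed from the query feature (sigmoid).
    gate = 1.0 / (1.0 + np.exp(-(query_feat @ W_gate)))
    return support_feats * gate  # gate broadcasts over support nodes

def gcn_layer(node_feats, adj, W):
    # One graph-convolution step: degree-normalized neighbor
    # aggregation followed by a linear map and ReLU.
    deg = adj.sum(axis=1, keepdims=True)
    agg = (adj @ node_feats) / np.maximum(deg, 1.0)
    return np.maximum(agg @ W, 0.0)

def common_centric_update(query_feat, support_feats, adj, W, W_gate):
    # Encode support features conditioned on the query, reason over
    # the support graph, pool a "common" feature, and use it to guide
    # the query feature via a residual update.
    cond = conditional_encode(support_feats, query_feat, W_gate)
    common = gcn_layer(cond, adj, W).mean(axis=0)  # pooled common feature
    return query_feat + common
```

For example, with three support features of dimension 8 and a fully connected support graph, `common_centric_update` returns a query feature of the same dimension, nudged toward the pooled common representation; the residual form is one plausible way to realize "guide the query image feature toward a more common-object-related representation."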

