Pyramidal Semantic Correspondence Networks.

Author information

Jeon Sangryul, Kim Seungryong, Min Dongbo, Sohn Kwanghoon

Publication information

IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):9102-9118. doi: 10.1109/TPAMI.2021.3123679. Epub 2022 Nov 7.

Abstract

This paper presents a deep architecture, called pyramidal semantic correspondence networks (PSCNet), that estimates locally-varying affine transformation fields across semantically similar images. To deal with the large appearance and shape variations that commonly exist among different instances within the same object category, we leverage a pyramidal model in which the affine transformation fields are progressively estimated in a coarse-to-fine manner, so that the smoothness constraint is naturally imposed. Unlike previous methods that directly estimate global or local deformations, our method first estimates the transformation for the entire image and then progressively increases the degrees of freedom of the transformation by dividing coarse cells into finer ones. To this end, we propose two spatial pyramid models that divide an image either into quad-tree rectangles or into multiple semantic elements of an object. Additionally, to overcome the limitation of insufficient training data, we introduce a novel weakly-supervised training scheme that generates progressively evolving supervision through the spatial pyramid models by leveraging correspondence consistency across image pairs. Extensive experimental results on various benchmarks, including TSS, Proposal Flow-WILLOW, Proposal Flow-PASCAL, Caltech-101, and SPair-71k, demonstrate that the proposed method outperforms the latest methods for dense semantic correspondence.
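The coarse-to-fine idea in the abstract can be illustrated with a minimal NumPy sketch of the quad-tree variant. This is not the PSCNet implementation: the function names (`quadtree_cells`, `compose`, `coarse_to_fine_affine_field`, `warp_coordinates`) are hypothetical, and `estimate_residual` is an assumed placeholder for whatever model predicts a per-cell affine update. The sketch only shows how a single whole-image transform can be refined into a locally-varying affine field by repeatedly splitting cells and composing residual transforms.

```python
import numpy as np

def quadtree_cells(height, width, level):
    """Bounds (y0, y1, x0, x1) of the 2**level x 2**level cells at one pyramid level."""
    n = 2 ** level
    # Integer arithmetic keeps finer cells exactly nested inside coarser ones.
    ys = [(i * height) // n for i in range(n + 1)]
    xs = [(j * width) // n for j in range(n + 1)]
    return [(ys[i], ys[i + 1], xs[j], xs[j + 1])
            for i in range(n) for j in range(n)]

def compose(first, second):
    """Compose two 2x3 affine matrices: apply `first`, then `second`."""
    a = np.vstack([first, [0.0, 0.0, 1.0]])
    b = np.vstack([second, [0.0, 0.0, 1.0]])
    return (b @ a)[:2]

def coarse_to_fine_affine_field(height, width, num_levels, estimate_residual):
    """Build an (H, W, 2, 3) affine field, refining from level 0 (whole image)."""
    identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    field = np.tile(identity, (height, width, 1, 1))
    for level in range(num_levels):
        for (y0, y1, x0, x1) in quadtree_cells(height, width, level):
            if y0 == y1 or x0 == x1:
                continue  # skip empty cells when the image is very small
            # Hypothetical predictor: returns a 2x3 residual affine for this cell.
            residual = estimate_residual(level, (y0, y1, x0, x1))
            # Every pixel in this cell still shares its parent's transform,
            # so one pixel is enough to read it.
            field[y0:y1, x0:x1] = compose(field[y0, x0], residual)
    return field

def warp_coordinates(field):
    """Apply the per-pixel affine field to homogeneous (x, y, 1) coordinates."""
    h, w = field.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(float)
    return np.einsum("hwij,hwj->hwi", field, coords)
```

Because each level only adds a residual on top of the parent cell's transform, neighbouring fine cells start from the same coarse estimate, which is one way to read the abstract's remark that smoothness is naturally imposed.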

