Understanding Image Representations by Measuring Their Equivariance and Equivalence.

Author information

Lenc Karel, Vedaldi Andrea

Affiliation

Department of Engineering Science, University of Oxford, Oxford, UK.

Publication information

Int J Comput Vis. 2019;127(5):456-476. doi: 10.1007/s11263-018-1098-y. Epub 2018 May 18.

Abstract

Despite the importance of image representations such as histograms of oriented gradients and deep Convolutional Neural Networks (CNN), our theoretical understanding of them remains limited. Aimed at filling this gap, we investigate two key mathematical properties of representations: equivariance and equivalence. Equivariance studies how transformations of the input image are encoded by the representation, invariance being a special case where a transformation has no effect. Equivalence studies whether two representations, for example two different parameterizations of a CNN, two different layers, or two different CNN architectures, share the same visual information or not. A number of methods to establish these properties empirically are proposed, including introducing transformation and stitching layers in CNNs. These methods are then applied to popular representations to reveal insightful aspects of their structure, including clarifying at which layers in a CNN certain geometric invariances are achieved and how various CNN architectures differ. We identify several predictors of geometric and architectural compatibility, including the spatial resolution of the representation and the complexity and depth of the models. While the focus of the paper is theoretical, direct applications to structured-output regression are demonstrated too.
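To make the notion of equivariance concrete, here is a minimal sketch on a toy representation, not the HOG or CNN models studied in the paper. It assumes a representation `phi` (a circular 1-D correlation followed by ReLU) and a transformation `g` (a circular shift), both hypothetical choices made for illustration. It then fits a linear map `M` by least squares so that `phi(g(x)) ≈ phi(x) @ M`, loosely in the spirit of an empirically estimated transformation layer. Sum pooling is added to show invariance as the special case where the transformation has no effect.

```python
import numpy as np

def phi(x, w):
    """Toy representation (an assumption): circular 1-D correlation, then ReLU."""
    # np.roll(x, -k)[..., i] == x[..., (i + k) % n], so this computes
    # out[..., i] = sum_k w[k] * x[..., (i + k) % n].
    out = sum(w[k] * np.roll(x, -k, axis=-1) for k in range(len(w)))
    return np.maximum(out, 0.0)

def g(x, shift=1):
    """Toy input transformation (an assumption): circular shift."""
    return np.roll(x, shift, axis=-1)

rng = np.random.default_rng(0)
n, num_samples = 8, 500
w = rng.normal(size=3)                    # fixed random filter
X = rng.normal(size=(num_samples, n))     # random "images", one per row

F = phi(X, w)         # features of the original inputs
F_g = phi(g(X), w)    # features of the transformed inputs

# Estimate the equivariance map empirically: find M with F @ M ≈ F_g.
M, *_ = np.linalg.lstsq(F, F_g, rcond=None)
residual = np.linalg.norm(F @ M - F_g) / np.linalg.norm(F_g)

# For this toy case the exact map is known: a shift of the feature vector.
P = np.roll(np.eye(n), 1, axis=1)

# Invariance as a special case: sum pooling discards the shift entirely.
pooled_gap = np.abs(F.sum(axis=1) - F_g.sum(axis=1)).max()

print("relative residual:", residual)
print("recovered a permutation:", np.allclose(M, P, atol=1e-6))
print("max pooled difference:", pooled_gap)
```

In this toy setting the fitted map coincides with a permutation matrix because the representation is exactly shift-equivariant. For real representations and transformations the fit is generally only approximate, and the size of the residual is what indicates how well the transformation is encoded.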

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0e4/6510825/976476a19915/11263_2018_1098_Fig1_HTML.jpg
