Learning Generalized Transformation Equivariant Representations Via AutoEncoding Transformations.

Author Information

Guo-Jun Qi, Liheng Zhang, Feng Lin, Xiao Wang

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2022 Apr;44(4):2045-2057. doi: 10.1109/TPAMI.2020.3029801. Epub 2022 Mar 4.

Abstract

Transformation equivariant representations (TERs) aim to capture the intrinsic visual structures that equivary to various transformations by expanding the notion of translation equivariance underlying the success of convolutional neural networks (CNNs). For this purpose, we present both deterministic AutoEncoding Transformations (AET) and probabilistic AutoEncoding Variational Transformations (AVT) models to learn visual representations from generic groups of transformations. While the AET is trained by directly decoding the transformations from the learned representations, the AVT is trained by maximizing the joint mutual information between the learned representation and transformations. This results in generalized TERs (GTERs) equivariant against transformations in a more general fashion by capturing complex patterns of visual structures beyond the conventional linear equivariance under a transformation group. The presented approach can be extended to (semi-)supervised models by jointly maximizing the mutual information of the learned representation with both labels and transformations. Experiments demonstrate the proposed models outperform the state-of-the-art models in both unsupervised and (semi-)supervised tasks. Moreover, we show that the unsupervised representation can even surpass the fully supervised representation pretrained on ImageNet when they are fine-tuned for the object detection task.
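The deterministic AET training scheme described above (decode the transformation from the representations of an image and its transformed copy) is concrete enough to sketch. Below is a minimal PyTorch illustration assuming an affine transformation group; the encoder/decoder sizes, module names, and the MSE regression loss are illustrative assumptions for this sketch, not the authors' exact architecture or objective.

```python
# Minimal sketch of the AET idea: encode the original image x and its
# transformed copy t(x), then decode the transformation parameters from
# the pair of representations. Everything below (layer sizes, affine
# parameterization, MSE loss) is an assumption for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AETSketch(nn.Module):
    def __init__(self, feat_dim=128, n_params=6):  # 6 = affine matrix entries
        super().__init__()
        # Toy encoder; the paper would use a deep CNN backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Transformation decoder operating on the concatenated pair.
        self.decoder = nn.Sequential(
            nn.Linear(2 * feat_dim, 128), nn.ReLU(),
            nn.Linear(128, n_params),
        )

    def forward(self, x, x_t):
        z, z_t = self.encoder(x), self.encoder(x_t)
        return self.decoder(torch.cat([z, z_t], dim=1))

def aet_loss(model, x, theta):
    """theta: (B, 2, 3) affine matrices sampled per image."""
    # Apply the sampled affine transformation to the batch.
    grid = F.affine_grid(theta, x.size(), align_corners=False)
    x_t = F.grid_sample(x, grid, align_corners=False)
    # Train by decoding the transformation: regress its parameters.
    return F.mse_loss(model(x, x_t), theta.flatten(1))

# Usage: sample near-identity affine transformations and compute one loss.
x = torch.randn(4, 3, 32, 32)
theta = torch.eye(2, 3).unsqueeze(0).repeat(4, 1, 1) + 0.1 * torch.randn(4, 2, 3)
loss = aet_loss(AETSketch(), x, theta)
```

The key design point the abstract emphasizes is that the representation is never asked to reconstruct the image itself, only to retain enough structure that the transformation between the two views remains decodable.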

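The probabilistic AVT objective, maximizing the mutual information between the learned representation and the transformation, is typically made tractable with a variational lower bound. As one hedged reading of that objective (a standard Barber-Agakov-style bound, with q an assumed variational transformation decoder; the abstract does not spell out the exact conditioning used in the paper):

```latex
I(t; z) = H(t) - H(t \mid z)
        \;\ge\; H(t) + \mathbb{E}_{p(t,z)}\!\left[\log q(t \mid z)\right]
```

With transformations t drawn from a fixed sampling distribution, H(t) is a constant, so maximizing this bound reduces to training q to decode the transformation from the representation, which is how the probabilistic AVT objective relates to the deterministic AET sketch above.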
