Zheng Mingkai, You Shan, Wang Fei, Qian Chen, Zhang Changshui, Wang Xiaogang, Xu Chang
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8502-8516. doi: 10.1109/TPAMI.2024.3406907. Epub 2024 Nov 6.
Self-supervised learning (SSL), including mainstream contrastive learning, has achieved great success in learning visual representations without data annotations. However, most methods focus on instance-level information (i.e., different augmented images of the same instance should have the same feature or fall into the same cluster) and pay little attention to the relationships between different instances. In this paper, we introduce a novel SSL paradigm, the relational self-supervised learning (ReSSL) framework, which learns representations by modeling the relationships between different instances. Specifically, our method uses a sharpened distribution of pairwise similarities among different instances as the relation metric, which is then used to match the feature embeddings of different augmentations. To boost performance, we argue that weak augmentations matter for representing a more reliable relation, and we adopt a momentum strategy for practical efficiency. An asymmetric predictor head and an InfoNCE warm-up strategy improve robustness to hyper-parameters and benefit the resulting performance. Experimental results show that ReSSL substantially outperforms state-of-the-art methods across different network architectures, including various lightweight networks (e.g., EfficientNet and MobileNet).
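The relation-matching idea described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names, the temperature values, the momentum coefficient, and the use of a fixed list of anchor embeddings are all assumptions made for illustration, and the predictor head and InfoNCE warm-up are omitted. The sketch assumes the weakly augmented view passes through a momentum (teacher) encoder and the other view through the trained (student) encoder.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def softmax(logits):
    # Numerically stable softmax.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def relation(embedding, anchors, temperature):
    # Distribution of similarities between one embedding and other instances.
    # A lower temperature gives a sharper distribution.
    z = l2_normalize(embedding)
    sims = [sum(a * b for a, b in zip(z, l2_normalize(anchor)))
            for anchor in anchors]
    return softmax([s / temperature for s in sims])

def relational_loss(student_emb, teacher_emb, anchors,
                    tau_student=0.1, tau_teacher=0.04):
    # Temperatures are illustrative assumptions, not values from the abstract.
    # The teacher relation (weak augmentation) is sharpened and used as target.
    target = relation(teacher_emb, anchors, tau_teacher)
    pred = relation(student_emb, anchors, tau_student)
    # Cross-entropy between the target relation and the predicted relation.
    return -sum(t * math.log(p) for t, p in zip(target, pred))

def momentum_update(teacher_params, student_params, m=0.99):
    # Exponential moving average of student parameters (m is an assumption).
    return [m * t + (1.0 - m) * s
            for t, s in zip(teacher_params, student_params)]
```

In this sketch, a student embedding that relates to the other instances in the same way as the teacher embedding yields a lower loss than one that relates to them differently, which is the matching behavior the abstract describes.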