Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2021 Nov;43(11):4037-4058. doi: 10.1109/TPAMI.2020.2992393. Epub 2021 Oct 1.

DOI:10.1109/TPAMI.2020.2992393

Abstract

Large-scale labeled data are generally required to train deep neural networks in order to obtain better performance in visual feature learning from images or videos for computer vision applications. To avoid the extensive cost of collecting and annotating large-scale datasets, self-supervised learning methods, a subset of unsupervised learning methods, have been proposed to learn general image and video features from large-scale unlabeled data without using any human-annotated labels. This paper provides an extensive review of deep learning-based self-supervised general visual feature learning methods from images or videos. First, the motivation, general pipeline, and terminologies of this field are described. Then the common deep neural network architectures that are used for self-supervised learning are summarized. Next, the schema and evaluation metrics of self-supervised learning methods are reviewed, followed by the commonly used datasets for images, videos, audio, and 3D data, as well as the existing self-supervised visual feature learning methods. Quantitative performance comparisons of the reviewed methods on benchmark datasets are then summarized and discussed for both image and video feature learning. Finally, the paper concludes with a set of promising future directions for self-supervised visual feature learning.

Similar articles

GAN-Based Image Colorization for Self-Supervised Visual Feature Learning.

Sensors (Basel). 2022 Feb 18;22(4):1599. doi: 10.3390/s22041599.

Exploiting Images for Video Recognition: Heterogeneous Feature Augmentation via Symmetric Adversarial Learning.

IEEE Trans Image Process. 2019 Nov;28(11):5308-5321. doi: 10.1109/TIP.2019.2917867. Epub 2019 May 24.

Semi Supervised Learning with Deep Embedded Clustering for Image Classification and Segmentation.

IEEE Access. 2019;7:11093-11104. doi: 10.1109/ACCESS.2019.2891970. Epub 2019 Jan 9.

Self-supervised representation learning using feature pyramid siamese networks for colorectal polyp detection.

Sci Rep. 2023 Dec 8;13(1):21655. doi: 10.1038/s41598-023-49057-6.

Detecting floating litter in freshwater bodies with semi-supervised deep learning.

Water Res. 2024 Nov 15;266:122405. doi: 10.1016/j.watres.2024.122405. Epub 2024 Sep 11.

Semi-supervised deep learning of brain tissue segmentation.

Neural Netw. 2019 Aug;116:25-34. doi: 10.1016/j.neunet.2019.03.014. Epub 2019 Apr 1.

Self-supervised Learning: A Succinct Review.

Arch Comput Methods Eng. 2023;30(4):2761-2775. doi: 10.1007/s11831-023-09884-2. Epub 2023 Jan 20.

Semi-supervised training of deep convolutional neural networks with heterogeneous data and few local annotations: An experiment on prostate histopathology image classification.

Med Image Anal. 2021 Oct;73:102165. doi: 10.1016/j.media.2021.102165. Epub 2021 Jul 14.

Deep self-supervised transformation learning for leukocyte classification.

J Biophotonics. 2023 Mar;16(3):e202200244. doi: 10.1002/jbio.202200244. Epub 2022 Dec 2.

Cited by

MIFA: Metadata, Incentives, Formats and Accessibility guidelines to improve the reuse of AI datasets for bioimage analysis.

Nat Methods. 2025 Sep 15. doi: 10.1038/s41592-025-02835-8.

OCT-SelfNet: a self-supervised framework with multi-source datasets for generalized retinal disease detection.

Front Big Data. 2025 Jul 29;8:1609124. doi: 10.3389/fdata.2025.1609124. eCollection 2025.

Applications of Computer Vision for Infectious Keratitis: A Systematic Review.

Ophthalmol Sci. 2025 Jun 19;5(6):100861. doi: 10.1016/j.xops.2025.100861. eCollection 2025 Nov-Dec.

Contrastive learning-driven framework for neuron morphology classification.

Sci Rep. 2025 Jul 30;15(1):27752. doi: 10.1038/s41598-025-11842-w.

A Closer Look at Benchmarking Self-supervised Pre-training with Image Classification.

Int J Comput Vis. 2025;133(8):5013-5025. doi: 10.1007/s11263-025-02402-w. Epub 2025 Apr 27.

GRANet: a graph residual attention network for gene regulatory network inference.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf349.

SFTA-Net: a self-supervised approach to detect copy-move and splicing forgery to leverage triplet loss, auxiliary loss, and spatial attention.

PeerJ Comput Sci. 2025 Apr 16;11:e2803. doi: 10.7717/peerj-cs.2803. eCollection 2025.

Integrating Multi-sensor Time-series Data for ALSFRS-R Clinical Scale Predictions in an ALS Patient Case Study.

AMIA Annu Symp Proc. 2025 May 22;2024:788-797. eCollection 2024.

RP-DETR: end-to-end rice pests detection using a transformer.

Plant Methods. 2025 May 17;21(1):63. doi: 10.1186/s13007-025-01381-w.

Enhancing breast cancer detection on screening mammogram using self-supervised learning and a hybrid deep model of Swin Transformer and convolutional neural networks.

J Med Imaging (Bellingham). 2025 Nov;12(Suppl 2):S22007. doi: 10.1117/1.JMI.12.S2.S22007. Epub 2025 May 14.
