Faculty of Computer Science and Engineering, University Ss. Cyril and Methodius, 1000 Skopje, North Macedonia.
Instituto de Telecomunicações, Universidade da Beira Interior, 6200-001 Covilhã, Portugal.
Sensors (Basel). 2022 Feb 18;22(4):1599. doi: 10.3390/s22041599.
Large-scale labeled datasets are generally necessary for successfully training a deep neural network in the computer vision domain. To avoid the costly and tedious work of manually annotating image datasets, self-supervised learning methods have been proposed to learn general visual features automatically. In this paper, we first focus on image colorization with generative adversarial networks (GANs) because of their ability to generate the most realistic colorization results. Then, via transfer learning, we use colorization as a proxy task for visual understanding. In particular, we propose to use conditional GANs (cGANs) for image colorization and to transfer the gained knowledge to two downstream tasks, namely, multilabel image classification and semantic segmentation. To the best of our knowledge, this is the first time that GANs have been used for self-supervised feature learning through image colorization. Through extensive experiments on the COCO and Pascal datasets, we show an improvement of 5% on the classification task and 2.5% on the segmentation task. This demonstrates that image colorization with conditional GANs can boost the performance of other downstream tasks without the need for manual annotation.