Self-supervised learning for remote sensing scene classification under the few shot scenario.

Author information

Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, 2454, Riyadh, 11451, Saudi Arabia.

Publication information

Sci Rep. 2023 Jan 9;13(1):433. doi: 10.1038/s41598-022-27313-5.

Scene classification is a crucial research problem in remote sensing (RS) that has attracted many researchers recently. It is challenging for several reasons: the complexity of remote sensing scenes, class overlap (a scene may contain objects that belong to other classes), and the difficulty of obtaining sufficient labeled scenes. Deep learning (DL) solutions, in particular convolutional neural networks (CNNs), are now the state of the art in RS scene classification; however, CNN models need huge amounts of annotated data, which can be costly and time-consuming to produce. On the other hand, it is relatively easy to acquire large amounts of unlabeled images. Recently, self-supervised learning (SSL) has been proposed as a method that can learn from unlabeled images, potentially reducing the need for labeling. In this work, we propose a deep SSL method, called RS-FewShotSSL, for RS scene classification under the few-shot scenario, in which only a few (fewer than 20) labeled scenes are available per class. Under this scenario, typical DL solutions that fine-tune CNN models pre-trained on the ImageNet dataset fail dramatically. In the SSL paradigm, a DL model is pre-trained from scratch during the pretext task using the large amounts of unlabeled scenes. Then, during the main, or so-called downstream, task, the model is fine-tuned on the labeled scenes. Our proposed RS-FewShotSSL solution is composed of an online network and a target network, both using the EfficientNet-B3 CNN model as a feature encoder backbone. During the pretext task, RS-FewShotSSL learns discriminative features from the unlabeled images using cross-view contrastive learning. Different views are generated from each image using geometric transformations and passed to the online and target networks. Then, the whole model is optimized by minimizing the cross-view distance between the outputs of the online and target networks.
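The pretext step described above (two geometric views of one image, one per network, and a distance between the two outputs) can be illustrated with a small sketch. The abstract does not state the exact transformations or the exact distance, so everything here is an assumption: `hflip`, `rot90`, and `cross_view_distance` are hypothetical names, and the distance is the squared Euclidean distance between unit-normalized vectors, as used in BYOL-style methods. The input vectors stand in for the outputs of the online and target networks; no EfficientNet-B3 encoder is involved.

```python
import math

def hflip(img):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate a 2-D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cross_view_distance(online_out, target_out):
    """Squared Euclidean distance between unit-normalized outputs.

    This equals 2 - 2 * cosine_similarity, so it ranges from 0
    (same direction) to 4 (opposite direction). It is an assumed
    stand-in for the loss, not the formula from the paper.
    """
    p = l2_normalize(online_out)
    z = l2_normalize(target_out)
    return sum((a - b) ** 2 for a, b in zip(p, z))

# Two geometric views of the same toy "scene".
scene = [[1, 2], [3, 4]]
view_a = hflip(scene)   # would be fed to the online network
view_b = rot90(scene)   # would be fed to the target network
```

Minimizing this distance over many unlabeled scenes pushes the two networks to agree on views of the same image, which is the sense in which the features become invariant to the chosen transformations.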
To address the problem of the limited computational resources available to us, our proposed method uses a novel DL architecture that can be trained using both high-resolution and low-resolution images. During the pretext task, RS-FewShotSSL is trained using low-resolution images, thereby allowing for larger batch sizes, which significantly boosts the performance of the proposed pipeline on the task of RS classification. In the downstream task, the target network is discarded, and the online network is fine-tuned using the few labeled shots, or scenes. Here, we use smaller batches of both high-resolution and low-resolution images. This architecture allows RS-FewShotSSL to benefit from both large batch sizes and full image sizes, thereby learning from the large amounts of unlabeled data in an effective way. We tested RS-FewShotSSL on three public RS datasets, and it demonstrated a significant improvement compared to other state-of-the-art methods such as SimCLR, MoCo, BYOL and IDSSL.
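The trade-off between image resolution and batch size can be made concrete with a rough calculation. This sketch assumes that memory use grows in proportion to the number of pixels in a batch, which is a simplification. The function `images_per_batch` and the image sizes (256 and 128 pixels per side) are hypothetical and are not taken from the paper.

```python
def images_per_batch(pixel_budget, side):
    """Number of square images of the given side that fit in a pixel budget.

    Assumes memory scales linearly with pixels per batch (a simplification).
    """
    if side <= 0:
        raise ValueError("side must be positive")
    return pixel_budget // (side * side)

# Hypothetical budget: whatever fits 16 images of 256 x 256 pixels.
budget = 16 * 256 * 256

full_res_batch = images_per_batch(budget, 256)  # 16 images
low_res_batch = images_per_batch(budget, 128)   # 64 images
```

Under this assumption, halving the side length quadruples the batch size that fits in the same budget, which matches the abstract's reasoning for using low-resolution images during the pretext task.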

