Delft University of Technology, Faculty of Civil Engineering and Geosciences, Department of Water Management, Stevinweg 1, 2628 CN Delft, The Netherlands.
Noria Sustainable Innovators, Schieweg 13, 2627 AN Delft, The Netherlands.
Water Res. 2024 Nov 15;266:122405. doi: 10.1016/j.watres.2024.122405. Epub 2024 Sep 11.
Researchers and practitioners have extensively used supervised Deep Learning methods to quantify floating litter in rivers and canals. These methods require large amounts of labeled data for training. Labeling is expensive and laborious, so the open datasets available in the field are small compared with the comprehensive datasets used in computer vision, e.g., ImageNet. Fine-tuning models pre-trained on these larger datasets helps improve litter detection performance and reduces data requirements. Yet, the effectiveness of features learned from generic datasets is limited in large-scale monitoring, where automated detection must adapt across different locations, environmental conditions, and sensor settings. To address this issue, we propose a two-stage semi-supervised learning method to detect floating litter based on Swapping Assignments between multiple Views of the same image (SwAV). SwAV is a self-supervised learning approach that learns the underlying feature representation from unlabeled data. In the first stage, we used SwAV to pre-train a ResNet50 backbone architecture on about 100k unlabeled images. In the second stage, we added new layers to the pre-trained ResNet50 to create a Faster R-CNN architecture, and fine-tuned it with a limited number of labeled images (≈1.8k images with 2.6k annotated litter items). We developed and validated our semi-supervised floating litter detection methodology on images collected in canals and waterways of Delft (the Netherlands) and Jakarta (Indonesia). We tested out-of-domain generalization performance in a zero-shot fashion using additional data from Ho Chi Minh City (Vietnam), Amsterdam and Groningen (the Netherlands). We benchmarked our results against the same Faster R-CNN architecture trained via supervised learning alone by fine-tuning ImageNet pre-trained weights.
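As a rough illustration of the self-supervised objective used in the first stage, the following is a minimal pure-Python sketch of a SwAV-style swapped prediction loss. It is not the authors' implementation: the function names, the toy prototype scores, and the values of `eps`, `n_iters`, and `temperature` are illustrative assumptions, and a real pipeline would compute the scores from ResNet50 features of two augmented views of each unlabeled image.

```python
import math

def sinkhorn(scores, eps=0.05, n_iters=3):
    """Turn prototype scores (B samples x K prototypes) into soft codes.

    Sinkhorn-Knopp normalization spreads the batch roughly evenly over the
    prototypes; each returned row sums to 1.
    """
    B, K = len(scores), len(scores[0])
    m = max(max(row) for row in scores)  # subtract max for numerical stability
    q = [[math.exp((s - m) / eps) for s in row] for row in scores]
    total = sum(map(sum, q))
    q = [[v / total for v in row] for row in q]
    for _ in range(n_iters):
        # Each prototype (column) should hold a total mass of 1/K.
        col = [sum(q[b][k] for b in range(B)) for k in range(K)]
        q = [[q[b][k] / (col[k] * K) for k in range(K)] for b in range(B)]
        # Each sample (row) should hold a total mass of 1/B.
        q = [[v / (sum(row) * B) for v in row] for row in q]
    return [[v * B for v in row] for row in q]

def log_softmax(row, temperature=0.1):
    z = [s / temperature for s in row]
    m = max(z)
    lse = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - lse for v in z]

def swapped_loss(scores_a, scores_b, temperature=0.1):
    """Codes from view A supervise predictions from view B, and vice versa."""
    q_a, q_b = sinkhorn(scores_a), sinkhorn(scores_b)
    loss = 0.0
    for i in range(len(scores_a)):
        lp_a = log_softmax(scores_a[i], temperature)
        lp_b = log_softmax(scores_b[i], temperature)
        loss -= sum(q * lp for q, lp in zip(q_a[i], lp_b))
        loss -= sum(q * lp for q, lp in zip(q_b[i], lp_a))
    return loss / (2 * len(scores_a))

# Toy cosine-similarity scores for 4 images and 3 prototypes (hypothetical data).
scores_a = [[0.9, 0.1, -0.2], [0.2, 0.8, 0.0], [-0.1, 0.1, 0.7], [0.8, 0.0, 0.1]]
scores_b = [[0.8, 0.2, -0.1], [0.1, 0.9, 0.1], [0.0, 0.2, 0.6], [0.7, 0.1, 0.0]]
codes = sinkhorn(scores_a)
loss = swapped_loss(scores_a, scores_b)
print(loss)
```

No labels appear anywhere in this objective, which is why it can be applied to the roughly 100k unlabeled images before any annotated litter items are used.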
The findings indicate that the semi-supervised learning method matches or surpasses the supervised learning benchmark when tested on new images from the same training locations. The method performed better when little data (≈200 images with about 300 annotated litter items) was available for fine-tuning, and it reduced false positive predictions. More importantly, the proposed approach demonstrates clear superiority for generalization on the unseen locations, with improvements in average precision of up to 12.7%. We attribute this superior performance to the more effective high-level feature extraction obtained by SwAV pre-training on relevant unlabeled images. Our findings highlight a promising direction to leverage semi-supervised learning for developing foundational models, which have revolutionized artificial intelligence applications in most fields. By scaling our proposed approach with more data and compute, we can make significant strides in monitoring to address the global challenge of litter pollution in water bodies.
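For readers unfamiliar with the reported metric, the following is a minimal sketch of how average precision can be computed for a single-class detector. The abstract does not state the exact evaluation protocol, so the IoU threshold of 0.5, the PASCAL VOC-style all-point interpolation, the function names, and the toy boxes are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def average_precision(detections, ground_truth, iou_thr=0.5):
    """detections: [(image_id, score, box)]; ground_truth: {image_id: [box, ...]}."""
    n_gt = sum(len(boxes) for boxes in ground_truth.values())
    if n_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}
    hits = []
    # Visit detections from most to least confident.
    for img, _score, box in sorted(detections, key=lambda d: -d[1]):
        best, best_j = 0.0, -1
        for j, gt in enumerate(ground_truth.get(img, [])):
            overlap = iou(box, gt)
            if overlap > best:
                best, best_j = overlap, j
        # True positive only if the best-overlapping item is not already claimed.
        hit = best_j >= 0 and best >= iou_thr and not matched[img][best_j]
        if hit:
            matched[img][best_j] = True
        hits.append(hit)
    tp = fp = 0
    precisions, recalls = [], []
    for hit in hits:
        tp += hit
        fp += not hit
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_gt)
    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap

# Hypothetical toy data: three annotated litter items across two images.
ground_truth = {"a": [(0, 0, 10, 10), (20, 20, 30, 30)], "b": [(5, 5, 15, 15)]}
detections = [
    ("a", 0.9, (0, 0, 10, 10)),
    ("b", 0.8, (50, 50, 60, 60)),  # false positive: no overlap with any item
    ("a", 0.7, (21, 21, 31, 31)),
    ("b", 0.6, (5, 5, 15, 15)),
]
ap = average_precision(detections, ground_truth)
print(round(ap, 4))
```

Because a confident false positive lowers precision at every subsequent recall level, a detector that produces fewer false positives scores a higher average precision even when it finds the same litter items.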