
Unpaired image to image translation for source free domain adaptation in semantic segmentation.

Authors

An Juan, He Zhaoshui, Guo Jing, Wang Jing, Lin Zhijie, Liang Hao, Tan Ji, Su Wenqing

Affiliations

School of Automation, Guangdong University of Technology, Guangzhou, 510006, China.

Guangdong-Hong Kong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangzhou, 510006, China.

Publication information

Sci Rep. 2025 Jul 2;15(1):23318. doi: 10.1038/s41598-025-05648-z.

Abstract

Source-free domain adaptation (SFDA) assumes that source data are inaccessible during domain adaptation. Current SFDA methods commonly use source-trained models to generate pseudolabels for unlabelled target data. SFDA for semantic segmentation has become topical and focuses on challenges such as pseudolabel noise, model overfitting, and class imbalance. To address these issues, this paper proposes an unpaired image-to-image (UITI) learning framework. Specifically, we select valid pseudolabels on the basis of image-style consistency via two source-trained discriminators to reduce pseudolabel noise caused by domain discrepancies. To prevent the source model from overfitting on the target domain, we generate augmented data as supplementary samples for the target data. These synthetic samples maintain feature-level knowledge of the source data while preserving domain-invariant structural characteristics of the target data. Furthermore, they enrich the training set with rare-class and key-region patches. Additionally, we propose a class alignment loss to balance the appearance frequency of classes, and a region alignment loss to preserve both global semantics and local details. Extensive experiments on two widely used benchmarks, GTA5 → Cityscapes and SYNTHIA → Cityscapes, show that the proposed method achieves state-of-the-art mIoU scores of 58.3% and 61.3%, respectively.
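The abstract does not give implementation details of the class alignment loss. A common way to balance the appearance frequency of classes in self-training is to weight each class inversely to its pseudolabel frequency; the following minimal sketch illustrates that idea only, with hypothetical function and variable names, and may differ from the loss actually used in the paper.

```python
import numpy as np

def class_alignment_weights(pseudo_labels, num_classes, eps=1e-6):
    """Per-class weights inversely proportional to pseudolabel frequency.

    Rare classes receive larger weights, so a weighted cross-entropy over
    these weights counteracts class imbalance during self-training.
    Hypothetical sketch; not the paper's exact formulation.
    """
    counts = np.bincount(pseudo_labels.ravel(),
                         minlength=num_classes).astype(float)
    freq = counts / max(counts.sum(), eps)      # empirical class frequency
    weights = 1.0 / (freq + eps)                # rare class -> large weight
    # Normalize so the weights sum to num_classes (mean weight ~ 1).
    return weights / weights.sum() * num_classes

# Toy pseudolabel map: class 0 dominates, class 2 is rare.
labels = np.array([[0, 0, 0, 1],
                   [0, 0, 1, 2]])
w = class_alignment_weights(labels, num_classes=3)
```

In practice such weights would multiply the per-pixel cross-entropy terms of the corresponding classes, pushing the model to attend to under-represented categories.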


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e396/12223052/67804cf52ae2/41598_2025_5648_Fig1_HTML.jpg
