Zhan Qianshan, Zeng Xiao-Jun, Wang Qian
Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9PL, United Kingdom.
Luca Healthcare R&D, Shanghai, 200000, China.
Neural Netw. 2025 May;185:107161. doi: 10.1016/j.neunet.2025.107161. Epub 2025 Jan 17.
Due to data privacy and storage concerns, Source-Free Unsupervised Domain Adaptation (SFUDA) aims to improve performance on an unlabelled target domain by leveraging a pre-trained source model, without access to the source data. While existing studies attempt to train target models by mitigating the biases induced by noisy pseudo labels, they often lack theoretical guarantees that these biases are fully reduced, and they have predominantly addressed classification rather than regression tasks. To address these gaps, we analyse the generalisation error bound of the target model in order to understand the intrinsic limitations of pseudo-label-based SFUDA methods. The theoretical results reveal that the biases influencing the generalisation error extend beyond the two that are commonly highlighted: the label inconsistency bias, which denotes the mismatch between pseudo labels and ground truths, and the feature-label mapping bias, which represents the difference between the proxy target regressor and the real target regressor. Equally significant is the feature misalignment bias, which indicates the misalignment between the estimated and real target feature distributions and is frequently neglected or not explicitly addressed in current studies. Additionally, the label inconsistency bias can be unbounded in regression because the label space is continuous, which further complicates SFUDA for regression tasks. Guided by these theoretical insights, we propose a Bias-Reduced Regression (BRR) method for SFUDA in regression. The method incorporates Feature Distribution Alignment (FDA) to reduce the feature misalignment bias, Hybrid Reliability Evaluation (HRE) to reduce the feature-label mapping bias, and pseudo-label updating to mitigate the label inconsistency bias. Experiments demonstrate the superior performance of the proposed BRR and the effectiveness of FDA and HRE in reducing biases for regression tasks in SFUDA.
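The abstract's remark that the label inconsistency bias can be unbounded in regression can be illustrated with a small numerical sketch. This is not the BRR method or the paper's bound. It assumes a 0-1 mismatch as the classification measure and a mean squared gap as the regression measure, and the function names `zero_one_inconsistency` and `squared_inconsistency` are hypothetical, introduced here for illustration only.

```python
import numpy as np


def zero_one_inconsistency(pseudo, truth):
    """Mean 0-1 mismatch between pseudo labels and ground truths.

    Assumed illustrative measure for classification; always lies in [0, 1].
    """
    pseudo = np.asarray(pseudo)
    truth = np.asarray(truth)
    return float(np.mean(pseudo != truth))


def squared_inconsistency(pseudo, truth):
    """Mean squared gap between pseudo labels and ground truths.

    Assumed illustrative measure for regression; has no upper bound.
    """
    pseudo = np.asarray(pseudo, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean((pseudo - truth) ** 2))


# Classification: even if every pseudo label is wrong, the mismatch is capped at 1.
cls_truth = np.array([0, 1, 2, 1])
cls_pseudo = np.array([1, 2, 0, 0])
cls_bias = zero_one_inconsistency(cls_pseudo, cls_truth)

# Regression: the gap keeps growing with the size of the pseudo-label error.
reg_truth = np.array([0.0, 1.0, 2.0, 3.0])
reg_bias_small = squared_inconsistency(reg_truth + 1.0, reg_truth)
reg_bias_large = squared_inconsistency(reg_truth + 100.0, reg_truth)

print(cls_bias, reg_bias_small, reg_bias_large)
```

Under these assumptions, shifting every pseudo label by a larger constant inflates the regression measure quadratically, while the classification measure cannot exceed 1 however wrong the pseudo labels are.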