Threshold-based exploitation of noisy label in black-box unsupervised domain adaptation.

Authors

Xu Huiwen, Lee Jaeri, Kang U

Affiliation

Data Mining Lab, Seoul National University, Seoul, Republic of Korea.

Publication information

PLoS One. 2025 May 12;20(5):e0321987. doi: 10.1371/journal.pone.0321987. eCollection 2025.

Abstract

How can we perform unsupervised domain adaptation when transferring a black-box source model to a target domain? Black-box Unsupervised Domain Adaptation focuses on transferring the labels derived from a pre-trained black-box source model to an unlabeled target domain. The problem setting is motivated by privacy concerns associated with accessing and utilizing source data or source model parameters. Recent studies typically train the target model by mimicking the labels derived from the black-box source model, which often contain noise due to domain gaps between the source and the target. Directly exploiting such noisy labels or disregarding them may lead to a decrease in the model's performance. We propose Threshold-Based Exploitation of Noisy Predictions (TEN), a method to accurately learn the target model with noisy labels in Black-box Unsupervised Domain Adaptation. To ensure the preservation of information from the black-box source model, we employ a threshold-based approach to distinguish between clean labels and noisy labels, thereby allowing the transfer of high-confidence knowledge from both labels. We utilize a flexible thresholding approach to adjust the threshold for each class, thereby obtaining an adequate amount of clean data for hard-to-learn classes. We also exploit knowledge distillation for clean data and negative learning for noisy labels to extract high-confidence information. Extensive experiments show that TEN outperforms baselines with an accuracy improvement of up to 9.49%.
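The abstract names three ingredients: a per-class threshold that separates clean from noisy source predictions, knowledge distillation on the clean part, and negative learning on the noisy part. The following is a minimal sketch of how such a pipeline could look, not the authors' implementation. The function names, the threshold scaling rule, the values `base_tau` and `neg_tau`, and the choice of complementary labels are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def class_thresholds(source_probs, base_tau=0.9):
    """Assumed rule: classes with few confident predictions get a lower threshold."""
    num_classes = len(source_probs[0])
    counts = [0] * num_classes
    for p in source_probs:
        k = max(range(num_classes), key=p.__getitem__)  # predicted class
        if p[k] >= base_tau:
            counts[k] += 1
    top = max(counts) or 1  # avoid division by zero when nothing is confident
    # Thresholds range from 0.5 * base_tau (rare class) to base_tau (most frequent).
    return [base_tau * (0.5 + 0.5 * c / top) for c in counts]


def split_clean_noisy(source_probs, thresholds):
    """Return index lists (clean, noisy) using the threshold of the predicted class."""
    clean, noisy = [], []
    for i, p in enumerate(source_probs):
        k = max(range(len(p)), key=p.__getitem__)
        (clean if p[k] >= thresholds[k] else noisy).append(i)
    return clean, noisy


def distillation_loss(source_p, target_p, eps=1e-12):
    """Cross-entropy of the target prediction against the soft source label."""
    return -sum(s * math.log(t + eps) for s, t in zip(source_p, target_p))


def negative_learning_loss(source_p, target_p, neg_tau=0.05, eps=1e-12):
    """Assumed complementary labels: classes the source model rates below neg_tau.
    The loss pushes the target probability of those classes toward zero."""
    return -sum(
        math.log(1.0 - t + eps)
        for s, t in zip(source_p, target_p)
        if s < neg_tau
    )


# Hypothetical black-box outputs for four target samples and three classes.
source_probs = [
    [0.95, 0.03, 0.02],
    [0.92, 0.05, 0.03],
    [0.10, 0.70, 0.20],  # moderately confident, but class 1 is rare
    [0.50, 0.47, 0.03],  # ambiguous between classes 0 and 1
]
thresholds = class_thresholds(source_probs)
clean, noisy = split_clean_noisy(source_probs, thresholds)
print("thresholds:", thresholds)
print("clean:", clean, "noisy:", noisy)
```

In this toy run the third sample counts as clean only because the threshold for its class is lowered, which mirrors the stated goal of keeping enough clean data for hard-to-learn classes. The fourth sample is treated as noisy, yet it still says with high confidence that class 2 is wrong, which is the kind of information negative learning can use.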

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f820/12068613/a9a0bb482be6/pone.0321987.g001.jpg
