School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China.
School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China.
Neural Netw. 2024 Apr;172:106137. doi: 10.1016/j.neunet.2024.106137. Epub 2024 Jan 29.
Learning with Noisy Labels (LNL) methods, which aim to improve the performance of Deep Neural Networks (DNNs) when the training dataset contains incorrectly annotated labels, have been widely studied in recent years. Popular existing LNL methods rely on semantic features extracted by the DNN to detect and mitigate label noise. However, these extracted features are often spurious and exhibit unstable correlations with the label across different environments (domains), which can lead to incorrect predictions and compromise the efficacy of LNL methods. To mitigate this shortcoming, we propose Invariant Feature based Label Correction (IFLC), which reduces spurious features and accurately utilizes the learned invariant features, whose correlations with the label are stable, to correct label noise. To the best of our knowledge, this is the first attempt to mitigate the issue of spurious features for LNL methods. IFLC consists of two critical processes: the Label Disturbing (LD) process and the Representation Decorrelation (RD) process. The LD process encourages the DNN to attain stable performance across different environments, thus reducing the captured spurious features. The RD process strengthens the independence between the dimensions of the representation vector, thus enabling accurate utilization of the learned invariant features for label correction. We then apply robust linear regression to the feature representations to conduct label correction. We evaluated the effectiveness of our proposed method and compared it with state-of-the-art (SOTA) LNL methods on four benchmark datasets: CIFAR-10, CIFAR-100, Animal-10N, and Clothing1M. The experimental results show that our proposed method achieves performance comparable to or better than existing SOTA methods. The source code is available at https://github.com/yangbo1973/IFLC.
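To make the two ingredients named above concrete, the following is a minimal, hypothetical sketch (not the paper's actual pipeline): it decorrelates the dimensions of a toy feature representation via ZCA-style whitening, in the spirit of the RD process, and then fits a robust linear regression (iteratively reweighted least squares with Huber weights) so that samples whose labels disagree with the fit, i.e. candidate noisy labels, receive large residuals and can be flagged. All names, dimensions, and the noise model here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for DNN penultimate-layer features: 200 samples, 5 dims,
# with artificially correlated dimensions (hypothetical setup).
n, d = 200, 5
z = rng.normal(size=(n, d))
features = z @ rng.normal(size=(d, d))  # mixing makes dims correlated

# --- Decorrelation (spirit of the RD process): ZCA-style whitening so
# each representation dimension is approximately uncorrelated.
mu = features.mean(axis=0)
cov = np.cov(features - mu, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
whiten = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + 1e-8)) @ eigvecs.T
decorrelated = (features - mu) @ whiten

# --- Synthetic labels with injected label noise on 20 samples.
true_w = rng.normal(size=d)
y = decorrelated @ true_w + 0.1 * rng.normal(size=n)
noisy_idx = rng.choice(n, size=20, replace=False)
y[noisy_idx] += rng.choice([-5.0, 5.0], size=20)  # corrupt 20 labels

# --- Robust linear regression via IRLS with Huber weights: noisy samples
# are down-weighted, so the fit tracks the clean majority.
w = np.linalg.lstsq(decorrelated, y, rcond=None)[0]
for _ in range(10):
    resid = y - decorrelated @ w
    scale = np.median(np.abs(resid)) / 0.6745 + 1e-8  # robust scale (MAD)
    r = np.abs(resid) / scale
    huber_k = 1.345
    weights = np.where(r <= huber_k, 1.0, huber_k / r)
    wx = decorrelated * weights[:, None]
    # Weighted normal equations: (X^T W X) w = X^T W y
    w = np.linalg.solve(wx.T @ decorrelated, wx.T @ y)

# Flag the 20 samples with the largest residuals as suspected noisy labels.
resid = np.abs(y - decorrelated @ w)
flagged = set(np.argsort(resid)[-20:])
recall = len(flagged & set(noisy_idx)) / 20
```

Under this toy noise model the injected corruptions dominate the residuals, so most of the 20 flagged samples coincide with the truly corrupted ones; in the paper's setting the flagged samples would instead be candidates for label correction.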