Wang Yuan, Pang Huaxin, Qin Ying, Wei Shikui, Zhao Yao
School of Computer Science and Technology, Beijing Jiaotong University, Beijing, 100044, China; Visual Intelligence +X International Cooperation Joint Laboratory of MOE, Beijing, 100044, China.
Department of Computer and Information Technology, Beijing Jiaotong University, Beijing, 100044, China; National Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, 100044, China.
Neural Netw. 2025 Aug;188:107464. doi: 10.1016/j.neunet.2025.107464. Epub 2025 Apr 12.
Instance-dependent noise (IDN) is widespread in real-world datasets and seriously hinders the effective application of deep neural networks. In contrast to class-dependent noise, IDN is influenced not solely by the class but also by the intrinsic features of the instance. Current methods for addressing IDN typically estimate the instance-dependent noise transition matrix (IDNT). However, these approaches often either assume a specific form for the IDNT or rely on anchor points, which can result in significant estimation errors. To tackle this issue, we propose a method that neither assumes a specific form for the IDNT nor requires anchor points. Specifically, we compute similarity scores between each instance's feature representation and the label representations to build the instance-label confusion matrix (ILCM), which captures the relationships between global instance features and different categories and indicates the degree of noise. We then adaptively combine the noisy class posteriors, taken from network predictions (for noisy data) or given labels (for clean data), with the ILCM, weighted by the degree of noise, to enhance the accuracy of IDNT estimation. Finally, the estimated IDNT is used to adjust the loss function, resulting in a more robust classifier. Comprehensive experiments comparing our method with state-of-the-art approaches on synthetic IDN datasets (F-MNIST, SVHN, CIFAR-10, CIFAR-100) and a real-world noisy dataset (Clothing1M) demonstrate the superiority and effectiveness of our approach.
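Two of the steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, how the ILCM and noisy posteriors are combined, or how the IDNT enters the loss. The sketch assumes cosine similarity followed by a softmax for the instance-label scores, and assumes standard forward loss correction (clean posterior multiplied by a per-instance transition matrix) for the loss adjustment. All function names, the `temperature` parameter, and the toy data are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def similarity_scores(features, label_reprs, temperature=1.0):
    """Assumed form of the instance-label scores: cosine similarity
    between each instance feature (N, D) and each label representation
    (C, D), normalised per instance with a softmax. Returns (N, C)."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    l = label_reprs / np.linalg.norm(label_reprs, axis=1, keepdims=True)
    return softmax(f @ l.T / temperature, axis=1)

def forward_corrected_nll(clean_probs, T, noisy_labels):
    """Assumed loss adjustment (forward correction).
    clean_probs: (N, C) predicted clean-class posteriors.
    T: (N, C, C) with T[n, i, j] = P(noisy label j | clean label i, x_n).
    noisy_labels: (N,) observed, possibly corrupted labels."""
    # Predicted noisy posterior for each instance: p_clean @ T[n]
    noisy_probs = np.einsum("ni,nij->nj", clean_probs, T)
    picked = noisy_probs[np.arange(len(noisy_labels)), noisy_labels]
    return -np.log(np.clip(picked, 1e-12, None)).mean()

# Toy data (hypothetical): 4 instances, 8-dim features, 3 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))
label_reprs = rng.normal(size=(3, 8))
scores = similarity_scores(features, label_reprs)

clean_probs = softmax(rng.normal(size=(4, 3)))
noisy_labels = np.array([0, 2, 1, 0])
# With an identity transition matrix (no noise) the corrected loss
# reduces to the ordinary negative log-likelihood.
T_identity = np.tile(np.eye(3), (4, 1, 1))
loss = forward_corrected_nll(clean_probs, T_identity, noisy_labels)
```

In this sketch the transition matrix is supplied by the caller. In the method described above it would instead be estimated per instance from the ILCM and the noisy class posteriors, a step the abstract does not detail and the sketch does not attempt.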