Gao Yun, Fu Junhu, Wang Yuanyuan, Guo Yi
School of Information Science and Technology, Fudan University, Shanghai, 200433, China.
Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention of Shanghai, Shanghai, 200433, China.
Vis Comput Ind Biomed Art. 2024 May 6;7(1):10. doi: 10.1186/s42492-024-00162-x.
Learning with noisy labels aims to train neural networks that remain accurate when some training labels are wrong. Current models handle instance-independent label noise (IIN) well but fall short on real-world noise. In medical image classification, atypical samples frequently receive incorrect labels, which makes instance-dependent label noise (IDN) an accurate representation of real-world scenarios. However, current IDN approaches do not consider the typicality of samples, which limits their ability to address real-world label noise. To alleviate these issues, we introduce typicality- and instance-dependent label noise (TIDN) to simulate real-world noise and establish a TIDN-combating framework. Specifically, we represent typicality by a sample's distance to the decision boundaries in the feature space and generate TIDN according to this typicality. We establish a TIDN-attention module to combat label noise and learn the transition matrix from the latent ground truth to the observed noisy labels. We also propose a recursive algorithm that enables the network to make correct predictions using corrections from the learned transition matrix. Our experiments demonstrate that TIDN simulates real-world noise more closely than existing IIN and IDN. Furthermore, the TIDN-combating framework achieves superior classification performance when trained with either simulated TIDN or actual real-world noise.
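The two ideas in the abstract, typicality-driven noise and a transition matrix from latent true labels to observed noisy labels, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the softmax top-2 margin as a stand-in for distance to the decision boundary, flips the least typical samples deterministically, and uses a fixed, instance-independent matrix `T`, whereas the paper learns the transition matrix. All function names are hypothetical.

```python
import numpy as np


def typicality_from_logits(logits):
    """Proxy for typicality (an assumption, not the paper's measure):
    the gap between the two largest softmax probabilities.
    A small gap means the sample lies close to a decision boundary."""
    z = logits - logits.max(axis=1, keepdims=True)  # stabilise the softmax
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    top2 = np.sort(p, axis=1)[:, -2:]  # columns: second largest, largest
    return top2[:, 1] - top2[:, 0]


def simulate_typicality_noise(logits, labels, noise_rate):
    """Flip the labels of the least typical samples to their most
    confusable other class (a deterministic simplification)."""
    n = len(labels)
    n_flip = int(round(noise_rate * n))
    typicality = typicality_from_logits(logits)
    flip_idx = np.argsort(typicality)[:n_flip]  # lowest typicality first

    # Exclude the current label so the flipped label always differs from it.
    masked = logits.astype(float).copy()
    masked[np.arange(n), labels] = -np.inf

    noisy = labels.copy()
    noisy[flip_idx] = masked[flip_idx].argmax(axis=1)
    return noisy, flip_idx


def forward_corrected_probs(clean_probs, T):
    """Map clean-class probabilities to noisy-label probabilities:
    p(noisy = j | x) = sum_i p(clean = i | x) * T[i, j].
    T is row-stochastic; here it is fixed rather than learned."""
    return clean_probs @ T
```

In a training loop built on this sketch, the loss would compare `forward_corrected_probs(...)` with the observed noisy labels, so the network's own output can remain an estimate of the clean class distribution.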