Combating Medical Label Noise through more precise partition-correction and progressive hard-enhanced learning.

Authors

Zhang Sanyan, Chu Surong, Qiang Yan, Zhao Juanjuan, Wang Yan, Wei Xiao

Affiliations

Imaging & Intelligence Lab, Taiyuan University of Technology, China.

Imaging & Intelligence Lab, Taiyuan University of Technology, China; School of Software, North University of China, Taiyuan, China.

Publication information

Comput Methods Programs Biomed. 2025 Jun;265:108734. doi: 10.1016/j.cmpb.2025.108734. Epub 2025 Mar 29.

Abstract

BACKGROUND AND OBJECTIVE

Computer-aided diagnosis systems based on deep neural networks rely heavily on datasets with high-quality labels. However, manual annotation for lesion diagnosis relies on image features and often requires professional experience and a complex image analysis process. This inevitably introduces noisy labels, which can misguide the training of classification models. Our goal is to design an effective method to address the challenges posed by label noise in medical images.

METHODS

We propose a novel noise-tolerant medical image classification framework consisting of two phases: fore-training correction and progressive hard-sample enhanced learning. In the first phase, we design a dual-branch sample partition detection scheme that classifies each instance into one of three subsets: clean, hard, or noisy. At the same time, we propose a hard-sample label refinement strategy based on class prototypes with confidence-perception weighting, together with a joint correction method for noisy samples, which yields higher-quality training data. In the second phase, we design a progressive hard-sample reinforcement learning method to enhance the model's ability to learn discriminative feature representations. This approach accounts for sample difficulty and mitigates the effects of label noise in medical datasets.
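The abstract does not specify the partition rule or the prototype computation, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes (hypothetically) that each of two branches yields a predicted label per sample, that a sample is "clean" when both branches agree with the given label, "noisy" when neither does, and "hard" otherwise, and that a hard sample is relabeled to the class of its nearest confidence-weighted prototype. The function names `partition_samples`, `weighted_prototypes`, and `refine_label` are illustrative.

```python
from typing import Dict, List, Sequence


def partition_samples(labels: Sequence[int],
                      preds_a: Sequence[int],
                      preds_b: Sequence[int]) -> Dict[str, List[int]]:
    """Split sample indices into clean / hard / noisy subsets.

    Assumed rule (not taken from the paper): count how many of the two
    branch predictions match the given label.
    """
    subsets: Dict[str, List[int]] = {"clean": [], "hard": [], "noisy": []}
    for i, (y, a, b) in enumerate(zip(labels, preds_a, preds_b)):
        matches = int(a == y) + int(b == y)
        if matches == 2:
            subsets["clean"].append(i)   # both branches confirm the label
        elif matches == 0:
            subsets["noisy"].append(i)   # both branches reject the label
        else:
            subsets["hard"].append(i)    # branches disagree with each other
    return subsets


def weighted_prototypes(features: Sequence[Sequence[float]],
                        labels: Sequence[int],
                        confidences: Sequence[float]) -> Dict[int, List[float]]:
    """Per-class prototype as the confidence-weighted mean feature vector."""
    sums: Dict[int, List[float]] = {}
    weights: Dict[int, float] = {}
    for x, y, w in zip(features, labels, confidences):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, v in enumerate(x):
            acc[k] += w * v
        weights[y] = weights.get(y, 0.0) + w
    # Skip classes whose total weight is zero to avoid division by zero.
    return {y: [v / weights[y] for v in acc]
            for y, acc in sums.items() if weights[y] > 0}


def refine_label(x: Sequence[float],
                 prototypes: Dict[int, List[float]]) -> int:
    """Return the class whose prototype is nearest in squared Euclidean distance."""
    def sqdist(p: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, p))
    return min(prototypes, key=lambda y: sqdist(prototypes[y]))
```

In this sketch the prototypes would typically be built from the clean subset only, so that suspected noisy labels do not pull the class centers; that choice is likewise an assumption.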

RESULTS

Our framework achieves an accuracy of 82.39% on the pneumoconiosis dataset collected by our laboratory. On a five-class skin disease dataset with six different levels of label noise (0, 0.05, 0.1, 0.2, 0.3, and 0.4), the average accuracy over the last ten epochs reaches 88.51%, 86.64%, 85.02%, 83.01%, 81.95%, and 77.89%, respectively. For binary polyp classification under noise rates of 0.2, 0.3, and 0.4, the average accuracy over the last ten epochs is 97.90%, 93.77%, and 89.33%, respectively.
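For readers unfamiliar with this evaluation protocol, the sketch below shows one common way to produce synthetic label noise at a given rate and to compute the "average accuracy over the last ten epochs" metric. The abstract does not state which noise model was used; symmetric (uniform) flipping is assumed here purely for illustration, and the helper names are hypothetical.

```python
import random
from typing import List, Sequence


def inject_symmetric_noise(labels: Sequence[int],
                           num_classes: int,
                           noise_rate: float,
                           seed: int = 0) -> List[int]:
    """Flip a fixed fraction of labels to a different, uniformly chosen class.

    Assumption: symmetric noise; the paper's actual noise model may differ.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_rate * len(labels))
    # Pick distinct indices so exactly n_flip labels are changed.
    for i in rng.sample(range(len(labels)), n_flip):
        choices = [c for c in range(num_classes) if c != labels[i]]
        noisy[i] = rng.choice(choices)
    return noisy


def mean_last_epochs(accuracies: Sequence[float], k: int = 10) -> float:
    """Average the test accuracy over the final k epochs."""
    tail = list(accuracies[-k:])
    if not tail:
        raise ValueError("accuracies must not be empty")
    return sum(tail) / len(tail)
```

Averaging over the final epochs rather than reporting the best epoch is a common choice in label-noise studies, since it also penalizes models that later memorize the noisy labels.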

CONCLUSIONS

The effectiveness of our proposed framework is demonstrated through its performance on three challenging datasets with both real and synthetic noise. Experimental results further demonstrate the robustness of our method across varying noise rates.
