Zhang Shuming, Ren Xueting, Qiang Yan, Zhao Juanjuan, Qiao Ying, Yue Huajie
College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, China.
School of Software, North University of China, Taiyuan, China.
J Xray Sci Technol. 2025 Jul;33(4):665-682. doi: 10.1177/08953996251319652. Epub 2025 Mar 25.
Background
Precise pneumoconiosis staging suffers from progressive pair label noise (PPLN) in chest X-ray datasets, because adjacent stages are confused due to unidentifiable and diffuse opacities in the lung fields. When deep neural networks are employed to aid disease staging, their performance degrades under such label noise.
Objective
This study improves the effectiveness of pneumoconiosis staging by mitigating the impact of PPLN through network architecture refinement and sample selection mechanism adjustment.
Methods
We propose a novel multi-branch architecture that incorporates dual-threshold sample selection. Several auxiliary branches are integrated in a two-phase module to learn and predict the . A novel difference-based metric is introduced to iteratively obtain the instance-specific thresholds as a complementary criterion for dynamic sample selection. All the samples are finally partitioned into and sets according to the dual-threshold criteria and treated differently by loss functions with penalty terms.
Results
Compared with state-of-the-art methods, the proposed method obtains the best metrics (accuracy: 90.92%, precision: 84.25%, sensitivity: 81.11%, F1-score: 82.06%, and AUC: 94.64%) under real-world PPLN, and is less sensitive to increases in the synthetic PPLN rate. An ablation study validates the respective contributions of the critical modules and demonstrates how variations in essential hyperparameters affect model performance.
Conclusions
The proposed method achieves substantial effectiveness and robustness against PPLN in a pneumoconiosis dataset, and can further assist physicians in diagnosing the disease with higher accuracy and confidence.
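The dual-threshold sample selection named in the Methods can be illustrated with a minimal sketch. The abstract does not define the difference-based metric, the threshold update rule, or the names of the two sample sets, so everything below is an assumption made for illustration only: the global criterion is taken to be the predicted probability of the given label, the difference score is taken to be that probability minus the largest probability among the other stages, the instance-specific thresholds are updated with an exponential moving average, and the two sets are given the neutral names `selected` and `held_out`. The function names and parameters (`dual_threshold_split`, `update_instance_thresholds`, `global_thr`, `momentum`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def dual_threshold_split(probs, labels, global_thr=0.5, inst_thr=None):
    """Partition samples into two index sets using two criteria.

    probs  : (N, C) array of predicted stage probabilities.
    labels : (N,) array of given (possibly noisy) stage labels.
    Both criteria are illustrative assumptions, not the paper's definitions.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n = probs.shape[0]
    if inst_thr is None:
        inst_thr = np.zeros(n)

    rows = np.arange(n)
    # Probability the model assigns to the given label.
    p_label = probs[rows, labels]

    # Largest probability among all other stages.
    masked = probs.copy()
    masked[rows, labels] = -np.inf
    p_other = masked.max(axis=1)

    # Assumed difference score: margin between given label and best rival.
    diff = p_label - p_other

    # A sample passes only if it satisfies both thresholds.
    keep = (p_label >= global_thr) & (diff >= inst_thr)
    selected = np.flatnonzero(keep)
    held_out = np.flatnonzero(~keep)
    return selected, held_out, diff


def update_instance_thresholds(inst_thr, diff, momentum=0.9):
    """Assumed iterative update: moving average of each sample's own score."""
    return momentum * np.asarray(inst_thr) + (1.0 - momentum) * np.asarray(diff)


if __name__ == "__main__":
    probs = [[0.80, 0.15, 0.05],
             [0.40, 0.45, 0.15],
             [0.10, 0.30, 0.60]]
    labels = [0, 0, 2]
    selected, held_out, diff = dual_threshold_split(probs, labels)
    thresholds = update_instance_thresholds(np.zeros(3), diff)
    print(selected, held_out, thresholds)
```

In this toy input the second sample is labelled stage 0 while the model slightly prefers the adjacent stage 1, which is the kind of adjacent-stage confusion the abstract attributes to PPLN; under the assumed criteria it fails both thresholds and lands in the second set, where a training loop could apply a different loss.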