Zhu Wenjun, Wang Xinyue, Xing Jie, Xu Xu Steven, Yuan Min
Department of Health Data Science, Anhui Medical University, Hefei, China.
The Second School of Clinical Medicine, Anhui Medical University, Hefei, China.
Quant Imaging Med Surg. 2025 Sep 1;15(9):8189-8204. doi: 10.21037/qims-2025-824. Epub 2025 Aug 12.
Lung cancer remains one of the malignant tumors with the highest global morbidity and mortality rates. Detecting pulmonary nodules in computed tomography (CT) images is essential for early lung cancer screening. However, traditional detection methods often suffer from low accuracy and efficiency, limiting their clinical effectiveness. This study aims to devise an advanced deep-learning framework capable of achieving high-precision, rapid identification of pulmonary nodules in CT imaging, thereby facilitating earlier and more accurate diagnosis of lung cancer.
To address these issues, this paper proposes an improved deep-learning framework named YOLOv8-BCD, based on YOLOv8 and integrating the BiFormer attention mechanism, Content-Aware ReAssembly of Features (CARAFE) up-sampling method, and Depth-wise Over-Parameterized Depth-wise Convolution (DO-DConv) enhanced convolution. To overcome common challenges such as low resolution, noise, and artifacts in lung CT images, the model employs Super-Resolution Generative Adversarial Network (SRGAN)-based image enhancement during preprocessing. The BiFormer attention mechanism is introduced into the backbone to enhance feature extraction capabilities, particularly for small nodules, while CARAFE and DO-DConv modules are incorporated into the head to optimize feature fusion efficiency and reduce computational complexity.
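To make the up-sampling step more concrete, the following is a minimal NumPy sketch of the reassembly idea behind CARAFE: each output location is a weighted sum of a k × k neighbourhood around its source location, using a kernel specific to that location. It is an illustration under stated assumptions, not the authors' implementation. In CARAFE the kernels are predicted from the feature content by a small sub-network; here they are simply passed in, and the function name `carafe_upsample` and its parameters are hypothetical.

```python
import numpy as np


def carafe_upsample(features, kernels, scale=2):
    """Reassemble features with per-location kernels (CARAFE-style sketch).

    features: (C, H, W) input feature map.
    kernels:  (H*scale, W*scale, k, k) reassembly kernels, one per output
              location, assumed already normalized (e.g. by a softmax).
              In CARAFE these are predicted from the features; here they
              are an input, which is a simplifying assumption.
    Returns a (C, H*scale, W*scale) up-sampled feature map.
    """
    c, h, w = features.shape
    out_h, out_w, k, k2 = kernels.shape
    assert k == k2 and k % 2 == 1, "kernel must be square with odd size"
    assert (out_h, out_w) == (h * scale, w * scale)

    r = k // 2
    # Zero-pad so every k x k neighbourhood is defined at the borders.
    padded = np.pad(features, ((0, 0), (r, r), (r, r)))
    out = np.zeros((c, out_h, out_w), dtype=float)

    for i in range(out_h):
        for j in range(out_w):
            si, sj = i // scale, j // scale          # source location
            patch = padded[:, si:si + k, sj:sj + k]  # (C, k, k) neighbourhood
            # The same kernel is shared across all channels.
            out[:, i, j] = (patch * kernels[i, j]).sum(axis=(1, 2))
    return out
```

With a kernel that is 1 at its centre and 0 elsewhere, this sketch reduces to nearest-neighbour up-sampling; content-dependent kernels are what allow the operator to adapt to local structure such as small nodules.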
Experimental comparisons using 550 CT images from the LUng Nodule Analysis 2016 (LUNA16) dataset demonstrated that the proposed YOLOv8-BCD achieved a detection accuracy of 86.4% and a mean average precision at an intersection over union (IoU) threshold of 0.5 (mAP@0.5) of 88.3%, surpassing YOLOv8 by 2.2% in accuracy and 4.5% in mAP@0.5. Additional evaluation on the external TianChi lung nodule dataset further confirmed the model's generalization capability, achieving an mAP@0.5 of 83.8% and an mAP@0.5:0.95 of 43.9% with an inference speed of 98 frames per second (FPS).
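The IoU threshold used in these metrics can be illustrated with a short sketch. The code below computes IoU for two axis-aligned boxes and then counts a detection as a true positive when it overlaps an unmatched ground-truth nodule with IoU of at least 0.5. The box format `(x1, y1, x2, y2)`, the greedy matching by descending score, and the function names are assumptions made for illustration; the evaluation code used in the study is not described in the abstract.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thr=0.5):
    """Precision and recall at a single IoU threshold.

    preds:  list of (score, box) detections.
    truths: list of ground-truth boxes.
    Detections are matched greedily in descending score order, and each
    ground-truth box can be matched at most once (an assumed convention).
    """
    matched = set()
    tp = 0
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_thr:
            matched.add(best_idx)
            tp += 1
    fp = len(preds) - tp   # detections with no matching nodule
    fn = len(truths) - tp  # nodules that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


# One correct detection and one spurious detection for a single nodule.
preds = [(0.9, (0, 0, 10, 10)), (0.8, (100, 100, 110, 110))]
truths = [(1, 1, 10, 10)]
precision, recall = precision_recall(preds, truths)
```

Average precision is obtained by sweeping the score threshold and summarizing the resulting precision-recall curve; the sketch stops at a single operating point to keep the role of the IoU threshold visible.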
The YOLOv8-BCD model effectively assists clinicians by significantly reducing interpretation time, improving diagnostic accuracy, and minimizing the risk of missed diagnoses, thereby enhancing patient outcomes.