Jiang Yifan, Ebrahimpour Leyla, Després Philippe, Manem Venkata Sk
Centre de recherche du CHU de Québec-Université Laval, Quebec City, Canada.
Département de biologie moléculaire, de biochimie médicale et de pathologie, Université Laval, Quebec City, Canada.
Sci Rep. 2025 Jan 11;15(1):1736. doi: 10.1038/s41598-024-84193-7.
Deep learning (DL) methods have demonstrated remarkable effectiveness in assisting with lung cancer risk prediction tasks using computed tomography (CT) scans. However, the lack of comprehensive comparison and validation of state-of-the-art (SOTA) models in practical settings limits their clinical application. This study aims to review and analyze current SOTA deep learning models for lung cancer risk prediction (malignant-benign classification). To evaluate the models' general performance, we selected 253 of 467 patients from a subset of the National Lung Screening Trial (NLST) who had non-contrast CT scans, the most commonly used type, and divided them into training and test cohorts. The CT scans were preprocessed into 2D-image and 3D-volume formats according to their nodule annotations. We evaluated ten 3D and eleven 2D SOTA deep learning models, pretrained on large-scale general-purpose datasets (Kinetics and ImageNet) and radiological datasets (3DSeg-8, nnUnet and RadImageNet), for their lung cancer risk prediction performance. Our results showed that 3D deep learning models generally perform better than 2D models. On the test cohort, the best-performing 3D model achieved an area under the receiver operating characteristic curve (AUROC) of 0.86, while the best 2D model reached 0.79. The lowest AUROCs for the 3D and 2D models were 0.70 and 0.62, respectively. Furthermore, pretraining on large-scale radiological image datasets did not show the expected performance advantage over pretraining on general-purpose datasets. Both 2D and 3D deep learning models can handle lung cancer risk prediction tasks effectively, although 3D models generally outperform their 2D counterparts. Our findings highlight the importance of carefully selecting pretraining datasets and model architectures for lung cancer risk prediction. Overall, these results have important implications for the development and clinical integration of DL-based tools in lung cancer screening.
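The AUROC values reported above summarize how well a model's risk scores separate malignant from benign nodules. As a minimal sketch of what that metric measures, and not the study's evaluation code, the snippet below computes AUROC as the fraction of malignant-benign pairs in which the malignant case receives the higher score, with ties counted as one half. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the NLST data.

```python
def auroc(labels, scores):
    """Pairwise AUROC: probability that a randomly chosen positive
    (malignant, label 1) case scores higher than a randomly chosen
    negative (benign, label 0) case. Ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example (illustrative only, not study data):
# 1 = malignant, 0 = benign; scores are predicted malignancy risks.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
toy_auc = auroc(labels, scores)
print(f"toy AUROC = {toy_auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a test cohort of a few hundred patients. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.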