Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, 101, Daehak-ro, Jongno-gu, Seoul, 03080, South Korea.
Institute of Radiation Medicine, Seoul National University Medical Research Center, 101, Daehak-ro, Jongno-gu, Seoul, 03080, South Korea.
Eur Radiol. 2020 Jun;30(6):3295-3305. doi: 10.1007/s00330-019-06628-4. Epub 2020 Feb 13.
To evaluate deep learning models for differentiating invasive pulmonary adenocarcinomas (IACs) among subsolid nodules (SSNs) considered for resection, in comparison with a size-based logistic model and expert radiologists, in a retrospective diagnostic cohort.
This study included 525 patients (309 women; median age, 62 years) to develop the models, and an independent cohort of 101 patients (57 women; median age, 66 years) was used for validation. A size-based logistic model and deep learning models using 2.5-dimensional (2.5D) and three-dimensional (3D) CT images were developed to discriminate IAC from less invasive pathologies. Overall performance, discrimination, and calibration were assessed. The diagnostic performance of three thoracic radiologists was compared with that of the deep learning model.
The overall performance of the deep learning models (Brier score, 0.122 for the 2.5D DenseNet and 0.121 for the 3D DenseNet) was superior to that of the size-based logistic model (Brier score, 0.198). The area under the receiver operating characteristic curve (AUC) of the 2.5D DenseNet (0.921) was significantly higher than those of the 3D DenseNet (0.835; p = 0.037) and the size-based logistic model (0.836; p = 0.009). At an equally high sensitivity of 90%, the 2.5D DenseNet showed significantly higher specificity (88.2%; all p < 0.05) and positive predictive value (97.4%; all p < 0.05) than the other models. Calibration was poor for all models (all p < 0.05). The performance of the 2.5D DenseNet was comparable to that of the radiologists (AUC, 0.848-0.910).
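For readers less familiar with the metrics reported above, the following is a minimal Python sketch of how a Brier score, an AUC, and the specificity and positive predictive value at a fixed sensitivity can be computed from predicted probabilities. It is not the authors' code: the function names and the toy labels and probabilities are hypothetical, chosen only for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
# Illustrative sketch only; toy data and function names are assumptions,
# not taken from the study.

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 is perfect)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def operating_point(y_true, y_prob, target_sensitivity=0.90):
    """Find the highest threshold whose sensitivity reaches the target.

    Returns the threshold with the sensitivity, specificity, and PPV at that cutoff.
    """
    pairs = list(zip(y_true, y_prob))
    for t in sorted(set(y_prob), reverse=True):
        tp = sum(1 for y, p in pairs if y == 1 and p >= t)
        fn = sum(1 for y, p in pairs if y == 1 and p < t)
        fp = sum(1 for y, p in pairs if y == 0 and p >= t)
        tn = sum(1 for y, p in pairs if y == 0 and p < t)
        sensitivity = tp / (tp + fn)
        if sensitivity >= target_sensitivity:
            return {
                "threshold": t,
                "sensitivity": sensitivity,
                "specificity": tn / (tn + fp),
                "ppv": tp / (tp + fp),
            }
    raise ValueError("target sensitivity cannot be reached")


# Hypothetical toy data: 1 = invasive adenocarcinoma, 0 = less invasive pathology.
y_true = [1, 1, 1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.3, 0.4, 0.2, 0.1]

print("Brier score:", brier_score(y_true, y_prob))
print("AUC:", auc(y_true, y_prob))
print("Operating point:", operating_point(y_true, y_prob))
```

Fixing sensitivity first and then reading off specificity, as in the sketch, is what allows models with different score scales to be compared at the same operating point.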
The 2.5D DenseNet model could be used as a highly sensitive and specific diagnostic tool to differentiate IACs among SSNs for surgical candidates.
• The deep learning model developed using the 2.5D DenseNet showed higher overall performance and discrimination than the size-based logistic model for differentiating invasive adenocarcinomas among subsolid nodules in surgical candidates.
• The 2.5D DenseNet demonstrated thoracic radiologist-level diagnostic performance and had higher specificity (88.2%) than the size-based logistic model (specificity, 52.9%) at an equal sensitivity of 90%.
• The 2.5D DenseNet could be used to reduce potential overtreatment of indolent subsolid nodules or to select candidates for sublobar resection instead of standard lobectomy.