Lunit, Seoul, Korea.
Division of Thoracic Imaging, Department of Radiology, Massachusetts General Hospital, 75 Blossom Court, Boston, MA, 02114, USA.
Eur Radiol. 2021 Dec;31(12):9664-9674. doi: 10.1007/s00330-021-08074-7. Epub 2021 Jun 4.
To assess whether a deep learning-based artificial intelligence (AI) algorithm improves reader performance for lung cancer detection on chest X-rays (CXRs).
This reader study included 173 images from cancer-positive patients (n = 98) and 346 images from cancer-negative patients (n = 196) selected from the National Lung Screening Trial (NLST). Eight readers, comprising three radiology residents and five board-certified radiologists, participated in the observer performance test. The AI algorithm provided an image-level probability of pulmonary nodule or mass on CXRs and a heatmap of detected lesions. Reader performance was compared using AUC, sensitivity, specificity, false-positive findings per image (FPPI), and rates of chest CT recommendations.
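The image-level metrics named above can be illustrated with a minimal sketch. The record layout (`ImageRead`), the function name `summarize`, and the exact metric definitions are assumptions made for illustration only; the abstract does not specify how the study scored individual reads.

```python
from dataclasses import dataclass


@dataclass
class ImageRead:
    """One reader's interpretation of one image (hypothetical layout)."""
    has_visible_cancer: bool    # reference standard for the image
    lesion_detected: bool       # reader marked the true lesion (cancer images only)
    false_positive_marks: int   # reader marks that do not correspond to a true lesion


def summarize(reads):
    """Return sensitivity, specificity, and FPPI under the assumed definitions."""
    positives = [r for r in reads if r.has_visible_cancer]
    negatives = [r for r in reads if not r.has_visible_cancer]
    # Sensitivity: fraction of cancer images in which the lesion was marked.
    sensitivity = sum(r.lesion_detected for r in positives) / len(positives)
    # Specificity: fraction of cancer-free images with no marks at all.
    specificity = sum(r.false_positive_marks == 0 for r in negatives) / len(negatives)
    # FPPI: false-positive marks averaged over every image read.
    fppi = sum(r.false_positive_marks for r in reads) / len(reads)
    return {"sensitivity": sensitivity, "specificity": specificity, "fppi": fppi}


# Toy data, not taken from the study.
toy_reads = [
    ImageRead(True, True, 0),
    ImageRead(True, False, 1),
    ImageRead(False, False, 0),
    ImageRead(False, False, 2),
]
toy_summary = summarize(toy_reads)
```

In a paired design such as this one, the same function would be applied once to the reads made without AI and once to the reads made with AI, and the two summaries compared.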
With AI, the average sensitivity of readers for the detection of visible lung cancer increased for residents but was similar for radiologists compared with that without AI (0.61 [95% CI, 0.55-0.67] vs. 0.72 [95% CI, 0.66-0.77], p = 0.016 for residents, and 0.76 [95% CI, 0.72-0.81] vs. 0.76 [95% CI, 0.72-0.81], p = 1.00 for radiologists), while the number of false-positive findings per image (FPPI) was similar for residents but decreased for radiologists (0.15 [95% CI, 0.11-0.18] vs. 0.12 [95% CI, 0.09-0.16], p = 0.13 for residents, and 0.24 [95% CI, 0.20-0.29] vs. 0.17 [95% CI, 0.13-0.20], p < 0.001 for radiologists). With AI, the average rate of chest CT recommendation in patients positive for visible cancer increased for residents but was similar for radiologists (54.7% [95% CI, 48.2-61.2%] vs. 70.2% [95% CI, 64.2-76.2%], p < 0.001 for residents, and 72.5% [95% CI, 68.0-77.1%] vs. 73.9% [95% CI, 69.4-78.3%], p = 0.68 for radiologists), while that in cancer-negative patients was similar for residents but decreased for radiologists (11.2% [95% CI, 9.6-13.1%] vs. 9.8% [95% CI, 8.0-11.6%], p = 0.32 for residents, and 16.4% [95% CI, 14.7-18.2%] vs. 11.7% [95% CI, 10.2-13.3%], p < 0.001 for radiologists).
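For readers who want the size of the statistically significant changes at a glance, the short sketch below computes absolute differences (with AI minus without AI) from the point estimates reported above. The helper name `absolute_change` is hypothetical, and the differences are simple arithmetic on the published averages, not values reported by the authors.

```python
def absolute_change(without_ai, with_ai, digits):
    """Absolute difference (with AI minus without AI), rounded for display."""
    return round(with_ai - without_ai, digits)


# Point estimates copied from the results; only the differences are derived here.
resident_sensitivity_change = absolute_change(0.61, 0.72, 2)    # proportion
radiologist_fppi_change = absolute_change(0.24, 0.17, 2)        # findings per image
resident_ct_rate_change = absolute_change(54.7, 70.2, 1)        # percentage points, visible cancer
radiologist_ct_rate_change = absolute_change(16.4, 11.7, 1)     # percentage points, cancer-negative

print(resident_sensitivity_change, radiologist_fppi_change,
      resident_ct_rate_change, radiologist_ct_rate_change)
```

Under these figures, resident sensitivity rose by 0.11 and the resident CT recommendation rate for visible cancer rose by 15.5 percentage points, while radiologist FPPI fell by 0.07 and the radiologist CT recommendation rate in cancer-negative patients fell by 4.7 percentage points.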
The AI algorithm can enhance the performance of readers for the detection of lung cancers on chest radiographs when used as a second reader.
• A reader study in the NLST dataset shows that the AI algorithm provided a sensitivity benefit for residents and a specificity benefit for radiologists for the detection of visible lung cancer.
• With AI, radiology residents recommended chest CT examinations more often (54.7% vs. 70.2%, p < 0.001) for patients with visible lung cancer.
• With AI, radiologists recommended a significantly smaller proportion of unnecessary chest CT examinations (16.4% vs. 11.7%, p < 0.001) in cancer-negative patients.