Lunit, Seoul, South Korea.
Division of Thoracic Imaging, Department of Radiology, Massachusetts General Hospital, Boston.
JAMA Netw Open. 2020 Sep 1;3(9):e2017135. doi: 10.1001/jamanetworkopen.2020.17135.
IMPORTANCE: Pulmonary nodules are difficult to detect on chest radiographs; improving their detection may strengthen the role of chest radiographs in the diagnosis of lung cancer.
OBJECTIVE: To assess the performance of a deep learning-based nodule detection algorithm for the detection of lung cancer on chest radiographs from participants in the National Lung Screening Trial (NLST).
DESIGN, SETTING, AND PARTICIPANTS: This diagnostic study used data from participants in the NLST to assess the performance of a deep learning-based artificial intelligence (AI) algorithm for the detection of pulmonary nodules and lung cancer on chest radiographs using separate training (in-house) and validation (NLST) data sets. Baseline (T0) posteroanterior chest radiographs from 5485 participants (full T0 data set) were used to assess lung cancer detection performance, and a subset of 577 of these images (nodule data set) was used to assess nodule detection performance. Participants aged 55 to 74 years who currently or formerly (ie, quit within the past 15 years) smoked cigarettes for 30 pack-years or more were enrolled in the NLST at 23 US centers between August 2002 and April 2004. Information on lung cancer diagnoses was collected through December 31, 2009. Analyses were performed between August 20, 2019, and February 14, 2020.
EXPOSURES: Abnormality scores produced by the AI algorithm.
MAIN OUTCOMES AND MEASURES: The performance of an AI algorithm for the detection of lung nodules and lung cancer on radiographs, with lung cancer incidence and mortality as primary end points.
RESULTS: A total of 5485 participants (mean [SD] age, 61.7 [5.0] years; 3030 men [55.2%]) were included, with a median follow-up duration of 6.5 years (interquartile range, 6.1-6.9 years). For the nodule data set, the sensitivity and specificity of the AI algorithm for the detection of pulmonary nodules were 86.2% (95% CI, 77.8%-94.6%) and 85.0% (95% CI, 81.9%-88.1%), respectively. For the detection of all cancers, the sensitivity was 75.0% (95% CI, 62.8%-87.2%), the specificity was 83.3% (95% CI, 82.3%-84.3%), the positive predictive value was 3.8% (95% CI, 2.6%-5.0%), and the negative predictive value was 99.8% (95% CI, 99.6%-99.9%). For the detection of malignant pulmonary nodules in all images of the full T0 data set, the sensitivity was 94.1% (95% CI, 86.2%-100.0%), the specificity was 83.3% (95% CI, 82.3%-84.3%), the positive predictive value was 3.4% (95% CI, 2.2%-4.5%), and the negative predictive value was 100.0% (95% CI, 99.9%-100.0%). In digital radiographs of the nodule data set, the AI algorithm had higher sensitivity (96.0% [95% CI, 88.3%-100.0%] vs 88.0% [95% CI, 75.3%-100.0%]; P = .32) and higher specificity (93.2% [95% CI, 89.9%-96.5%] vs 82.8% [95% CI, 77.8%-87.8%]; P = .001) for nodule detection compared with the NLST radiologists. For malignant pulmonary nodule detection on digital radiographs of the full T0 data set, the sensitivity of the AI algorithm was higher (100.0% [95% CI, 100.0%-100.0%] vs 94.1% [95% CI, 82.9%-100.0%]; P = .32) compared with the NLST radiologists, and the specificity (90.9% [95% CI, 89.6%-92.1%] vs 91.0% [95% CI, 89.7%-92.2%]; P = .91), positive predictive value (8.2% [95% CI, 4.4%-11.9%] vs 7.8% [95% CI, 4.1%-11.5%]; P = .65), and negative predictive value (100.0% [95% CI, 100.0%-100.0%] vs 99.9% [95% CI, 99.8%-100.0%]; P = .32) were similar to those of NLST radiologists.
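As an illustration of how the four accuracy measures reported above relate to a 2x2 table of algorithm output versus reference standard, the following is a minimal Python sketch. It is not the study's analysis code. The function names `proportion_with_ci` and `diagnostic_metrics`, the normal-approximation (Wald) interval, and the counts in the example are all assumptions chosen for illustration; the counts are not NLST data.

```python
import math


def proportion_with_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion.

    Uses a normal-approximation (Wald) interval clipped to [0, 1].
    This interval choice is an assumption made for illustration.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def diagnostic_metrics(tp, fp, fn, tn):
    """Compute sensitivity, specificity, PPV, and NPV from 2x2 counts."""
    return {
        "sensitivity": proportion_with_ci(tp, tp + fn),  # positives found among diseased
        "specificity": proportion_with_ci(tn, tn + fp),  # negatives found among non-diseased
        "ppv": proportion_with_ci(tp, tp + fp),          # diseased among test-positive
        "npv": proportion_with_ci(tn, tn + fn),          # non-diseased among test-negative
    }


# Hypothetical counts for illustration only (not taken from the study).
example = diagnostic_metrics(tp=18, fp=150, fn=2, tn=830)
for name, (estimate, lower, upper) in example.items():
    print(f"{name}: {estimate:.1%} (95% CI, {lower:.1%}-{upper:.1%})")
```

With these made-up counts the disease is rare (20 of 1000 images), so the positive predictive value stays low even though sensitivity is high, which is the same pattern seen in the low-prevalence screening population described above.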
CONCLUSIONS AND RELEVANCE: In this study, the AI algorithm performed better than NLST radiologists for the detection of pulmonary nodules on digital radiographs. When used as a second reader, the AI algorithm may help to detect lung cancer.