Lunit, Inc, Seoul, South Korea.
Division of Thoracic Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts.
JAMA Netw Open. 2022 Aug 1;5(8):e2229289. doi: 10.1001/jamanetworkopen.2022.29289.
IMPORTANCE: The efficient and accurate interpretation of radiologic images is paramount.
OBJECTIVE: To evaluate whether a deep learning-based artificial intelligence (AI) engine used concurrently can improve reader performance and efficiency in interpreting chest radiograph abnormalities.
DESIGN, SETTING, AND PARTICIPANTS: This multicenter cohort study was conducted from April to November 2021 and involved radiologists, including attending radiologists, thoracic radiology fellows, and residents, who independently participated in 2 observer performance test sessions: one reading session with AI and one without AI, performed in a randomized crossover manner with a 4-week washout period in between. The AI produced a heat map and an image-level probability of the presence of a referable lesion. The data were collected at 2 quaternary academic hospitals in Boston, Massachusetts: Beth Israel Deaconess Medical Center (the Medical Information Mart for Intensive Care Chest X-Ray [MIMIC-CXR] data set) and Massachusetts General Hospital (MGH).
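The study does not describe the AI engine's internals, only that it outputs a heat map plus an image-level probability per finding. Purely as an illustration of how those two outputs can relate, the sketch below assumes a class-activation-map style model whose regional logits are turned into a heat map and max-pooled into an image-level score; the function name and array shapes are hypothetical.

```python
# Hypothetical sketch: deriving a heat map and an image-level probability
# from per-region logits for one finding. This is NOT the study AI's
# actual architecture; it only illustrates the two described outputs.
import numpy as np

def heatmap_and_probability(activation_map: np.ndarray) -> tuple[np.ndarray, float]:
    """activation_map: 2-D array of per-region logits for a single finding."""
    # Sigmoid maps each regional logit to a local probability (the heat map).
    heatmap = 1.0 / (1.0 + np.exp(-activation_map))
    # Image-level probability: take the most suspicious region (max pooling).
    image_probability = float(heatmap.max())
    return heatmap, image_probability

# Example: a coarse 4x4 activation map for one chest radiograph.
logits = np.random.default_rng(0).normal(loc=-2.0, scale=1.5, size=(4, 4))
heatmap, prob = heatmap_and_probability(logits)
print(f"image-level probability of the finding: {prob:.2f}")
```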
MAIN OUTCOMES AND MEASURES: The ground truth for the labels was created via consensus reading by 2 thoracic radiologists. Each reader documented their findings in a customized report template, in which the presence of the 4 target chest radiograph findings and the reader's confidence in each finding were recorded. The time taken to report each chest radiograph was also recorded. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC) were calculated for each target finding.
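A minimal sketch of how the per-finding sensitivity, specificity, and AUROC described above could be computed from confidence scores and consensus labels. The arrays, the 0.5 operating point, and the scikit-learn dependency are assumptions for illustration, not the study's actual analysis code.

```python
# Per-finding metrics from hypothetical labels and confidence scores.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# 1 = finding present per the consensus reading, 0 = absent.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
# Reader (or AI) confidence that the finding is present, on a 0-1 scale.
confidence = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3])

# Binarize at a chosen operating point for sensitivity and specificity.
y_pred = (confidence >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # true-positive rate
specificity = tn / (tn + fp)   # true-negative rate

# AUROC uses the continuous confidence scores directly.
auroc = roc_auc_score(y_true, confidence)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} AUROC={auroc:.3f}")
```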
RESULTS: A total of 6 radiologists (2 attending radiologists, 2 thoracic radiology fellows, and 2 residents) participated in the study. The study included 497 frontal chest radiographs from adult patients with and without the 4 target findings (pneumonia, nodule, pneumothorax, and pleural effusion): 247 from the MIMIC-CXR data set (demographic data for patients were not available) and 250 from MGH (mean [SD] age, 63 [16] years; 133 men [53.2%]). The target findings were present in 351 of 497 chest radiographs. The AI showed higher sensitivity than the readers for nodule (0.816 [95% CI, 0.732-0.882] vs 0.567 [95% CI, 0.524-0.611]), pneumonia (0.887 [95% CI, 0.834-0.928] vs 0.673 [95% CI, 0.632-0.714]), and pneumothorax (0.988 [95% CI, 0.932-1.000] vs 0.792 [95% CI, 0.756-0.827]), and comparable sensitivity for pleural effusion (0.872 [95% CI, 0.808-0.921] vs 0.889 [95% CI, 0.862-0.917]). AI-aided interpretation was associated with significantly improved reader sensitivity for all target findings, without a negative impact on specificity. Overall, reader AUROCs improved for all 4 target findings, with significant improvements in the detection of pneumothorax and nodule. The mean reporting time was 10% shorter with AI than without AI (36.9 vs 40.8 seconds; difference, 3.9 seconds; 95% CI, 2.9-5.2 seconds; P < .001).
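A quick back-of-the-envelope check of the roughly 10% reduction in reporting time, using only the mean values quoted above (40.8 seconds without AI, 36.9 seconds with AI). This is an illustrative verification of the arithmetic, not the study's paired statistical analysis.

```python
# Verify the ~10% reduction implied by the reported mean reporting times.
time_without_ai = 40.8  # seconds, reading session without AI
time_with_ai = 36.9     # seconds, reading session with AI

difference = time_without_ai - time_with_ai          # 3.9 seconds
relative_reduction = difference / time_without_ai    # ~0.096, i.e. ~10%
print(f"difference={difference:.1f}s, reduction={relative_reduction:.1%}")
```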
CONCLUSIONS AND RELEVANCE: These findings suggest that AI-aided interpretation was associated with improved reader performance and efficiency in identifying major thoracic findings on chest radiographs.