Department of Radiology, University Hospital, LMU Munich, Munich, Germany.
Institute of Diagnostic and Interventional Neuroradiology, University Hospital, LMU Munich, Munich, Germany.
Crit Care Med. 2020 Jul;48(7):e574-e583. doi: 10.1097/CCM.0000000000004397.
Interpretation of lung opacities in ICU supine chest radiographs remains challenging. We evaluated a prototype artificial intelligence algorithm to classify basal lung opacities according to underlying pathologies.
Retrospective study. The deep neural network was trained on two publicly available datasets comprising 297,541 images from 86,876 patients.
One hundred sixty-six patients received both a supine chest radiograph and a CT scan (reference standard) within 90 minutes, without any intervention in between.
Algorithm accuracy was referenced to board-certified radiologists, who evaluated supine chest radiographs using side-separate reading scores for pneumonia and effusion (0 = absent, 1 = possible, 2 = highly suspected). Radiologists were blinded to the supine chest radiograph findings during CT interpretation. The performances of the radiologists and the artificial intelligence algorithm were quantified by receiver-operating characteristic curve analysis, and diagnostic metrics (sensitivity, specificity, positive predictive value, negative predictive value, and accuracy) were calculated at different receiver-operating characteristic operating points.

For pneumonia detection, radiologists achieved a maximum diagnostic accuracy of 0.87 (95% CI, 0.78-0.93) when only a supine chest radiograph reading score of 2 was considered positive for pneumonia. The radiologists' maximum sensitivity of 0.87 (95% CI, 0.76-0.94) was achieved by additionally rating a reading score of 1 as positive for pneumonia and taking previous examinations into account. Radiologic assessment yielded nonsignificantly higher results than the artificial intelligence algorithm: artificial intelligence area under the receiver-operating characteristic curve of 0.737 (0.659-0.815) versus radiologists' area under the curve of 0.779 (0.723-0.836); diagnostic metrics at the receiver-operating characteristic operating points did not differ significantly. For the detection of pleural effusions, there was no significant performance difference between the radiologists and the artificial intelligence algorithm: artificial intelligence area under the receiver-operating characteristic curve of 0.740 (0.662-0.817) versus radiologists' area under the curve of 0.698 (0.646-0.749), with similar diagnostic metrics at the receiver-operating characteristic operating points.
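For readers unfamiliar with how such operating-point metrics relate to the reported areas under the curve, the sketch below shows a conventional way to compute them from binary CT-confirmed labels and continuous prediction scores. This is not the study's analysis code; the variable names (y_true, y_score, threshold) and the synthetic example data are illustrative assumptions.

```python
# Minimal sketch (assumed workflow, not the authors' code): ROC AUC plus
# sensitivity, specificity, PPV, NPV, and accuracy at one operating point.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def metrics_at_threshold(y_true, y_score, threshold):
    """Confusion-matrix-derived diagnostic metrics at a chosen score threshold."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = (np.asarray(y_score) >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Synthetic example: CT-confirmed labels vs. continuous algorithm scores.
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.6, 0.5])

auc = roc_auc_score(y_true, y_score)               # area under the ROC curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)  # candidate operating points
print(auc, metrics_at_threshold(y_true, y_score, threshold=0.5))
```

In the study, the radiologists' ordinal reading scores (0/1/2) play the role of thresholds: treating only score 2 as positive yields one operating point, and additionally treating score 1 as positive yields another, more sensitive one.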
Given the minor performance differences between the algorithm and the radiologists, we regard artificial intelligence as a promising clinical decision support tool for supine chest radiograph examinations in clinical routine, with high potential to reduce the number of missed findings in an artificial intelligence-assisted reading setting.