Department of Radiology, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390.
Health Systems Information Resources, University of Texas Southwestern Health Systems, Dallas, TX.
AJR Am J Roentgenol. 2022 Dec;219(6):895-902. doi: 10.2214/AJR.22.27895. Epub 2022 Jul 13.
Artificial intelligence (AI) algorithms have shown strong performance for detection of pulmonary embolism (PE) on CT examinations performed using a dedicated protocol for PE detection. AI performance is less well studied for detecting PE on examinations ordered for reasons other than suspected PE (i.e., incidental PE [iPE]). The purpose of this study was to assess the diagnostic performance of an AI algorithm for detection of iPE on conventional contrast-enhanced chest CT examinations. This retrospective study included 2555 patients (mean age, 53.2 ± 14.5 [SD] years; 1340 women, 1215 men) who underwent 3003 conventional contrast-enhanced chest CT examinations (i.e., not using pulmonary CTA protocols) between September 2019 and February 2020. A commercial AI algorithm was applied to the images to detect acute iPE. A vendor-supplied natural language processing (NLP) algorithm was applied to the clinical reports to identify examinations interpreted as positive for iPE. For all examinations that were positive by the AI-based image review or by NLP-based report review, a multireader adjudication process was implemented to establish a reference standard for iPE. Images were also reviewed to identify explanations of AI misclassifications. On the basis of the adjudication process, the frequency of iPE was 1.3% (40/3003). AI detected four iPEs missed by clinical reports, and clinical reports detected seven iPEs missed by AI. AI, compared with clinical reports, exhibited significantly lower PPV (86.8% vs 97.3%, p = .03) and specificity (99.8% vs 100.0%, p = .045). Differences in sensitivity (82.5% vs 90.0%, p = .37) and NPV (99.8% vs 99.9%, p = .36) were not significant. For AI, neither sensitivity nor specificity varied significantly in association with age, sex, patient status, or cancer-related clinical scenario (all p > .05).
Explanations of false-positives by AI included metastatic lymph nodes and pulmonary venous filling defect, and explanations of false-negatives by AI included surgically altered anatomy and small-caliber subsegmental vessels. AI had high NPV and moderate PPV for iPE detection, detecting some iPEs missed by radiologists. Potential applications of the AI tool include serving as a second reader to help detect additional iPEs or as a worklist triage tool to allow earlier iPE detection and intervention. Various explanations of AI misclassifications may provide targets for model improvement.
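The reported percentages can be checked with a short calculation. The sketch below is illustrative only: the abstract does not state the confusion-matrix counts, so the true-positive and false-positive counts used here (33 and 5 for AI, 36 and 1 for clinical reports) are back-calculated assumptions chosen to be consistent with 40 adjudicated iPEs among 3003 examinations, seven iPEs missed by AI, four missed by clinical reports, and the stated PPVs. It also assumes that examinations negative by both AI and NLP count as true negatives, since only positive examinations were adjudicated. The names `ConfusionCounts` and `counts_from` are hypothetical helpers, not part of any tool used in the study.

```python
from dataclasses import dataclass

TOTAL_EXAMS = 3003   # examinations in the study (from the abstract)
IPE_POSITIVE = 40    # adjudicated iPE-positive examinations (from the abstract)


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 confusion-matrix counts for one reader (AI or clinical report)."""
    tp: int
    fp: int
    fn: int
    tn: int

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def summary(self) -> dict:
        """All four metrics as percentages rounded to one decimal place."""
        return {
            "sensitivity": round(100 * self.sensitivity(), 1),
            "specificity": round(100 * self.specificity(), 1),
            "ppv": round(100 * self.ppv(), 1),
            "npv": round(100 * self.npv(), 1),
        }


def counts_from(tp: int, fp: int) -> ConfusionCounts:
    """Fill in FN and TN from the study totals, given assumed TP and FP."""
    fn = IPE_POSITIVE - tp
    tn = TOTAL_EXAMS - IPE_POSITIVE - fp
    return ConfusionCounts(tp=tp, fp=fp, fn=fn, tn=tn)


# Assumed (back-calculated) counts, not reported in the abstract.
ai = counts_from(tp=33, fp=5)        # 40 - 7 missed by AI; 33/38 gives PPV 86.8%
reports = counts_from(tp=36, fp=1)   # 40 - 4 missed by reports; 36/37 gives PPV 97.3%

if __name__ == "__main__":
    for name, counts in (("AI", ai), ("Clinical reports", reports)):
        print(name, counts.summary())
```

Under these assumed counts, the rounded values match the sensitivity, specificity, PPV, and NPV quoted above for both AI and clinical reports. The clinical-report specificity of 100.0% is a rounded figure (2962/2963), which is compatible with one false-positive report.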