Department of Radiology, University of Groningen, University Medical Center of Groningen, 9713GZ Groningen, The Netherlands.
Department of Radiation Oncology, University of Groningen, University Medical Center of Groningen, 9713GZ Groningen, The Netherlands; Data Science in Health (DASH), University of Groningen, University Medical Center of Groningen, 9713GZ Groningen, The Netherlands.
Eur J Radiol. 2023 Oct;167:111067. doi: 10.1016/j.ejrad.2023.111067. Epub 2023 Aug 26.
To evaluate the performance of artificial intelligence (AI) software for automatic thoracic aortic diameter assessment in a heterogeneous cohort with low-dose, non-contrast chest computed tomography (CT).
Participants of the Imaging in Lifelines (ImaLife) study who underwent low-dose, non-contrast chest CT (August 2017-May 2022) were included using random samples of 80 participants each from three groups: age <50 years, age ≥80 years, and thoracic aortic diameter ≥40 mm. AI-based aortic diameters at eight guideline-compliant positions were compared with manual measurements. In 90 examinations (30 per group), diameters were reassessed to determine intra- and inter-reader variability, which was compared with the discrepancy of the AI system using Bland-Altman analysis, paired-samples t-testing and linear mixed models.
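As a rough illustration of the Bland-Altman comparison named above, the following minimal sketch computes the bias, the 95% limits of agreement, and a paired t statistic for two sets of paired diameter measurements. It is not the study's analysis code: the function name `bland_altman`, the sample values, and the definition of the repeatability coefficient as 1.96 times the standard deviation of the paired differences are all assumptions made for this example.

```python
import math
import statistics


def bland_altman(manual, automatic):
    """Summarize agreement between paired measurements (hypothetical helper).

    Assumes the repeatability coefficient is 1.96 * SD of the paired
    differences; the study may have used a different definition.
    """
    if len(manual) != len(automatic) or len(manual) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")

    # Paired differences: automatic minus manual, in mm.
    diffs = [a - m for m, a in zip(manual, automatic)]
    bias = statistics.mean(diffs)   # systematic offset between methods
    sd = statistics.stdev(diffs)    # sample SD of the differences
    rc = 1.96 * sd                  # assumed repeatability coefficient

    # Paired-samples t statistic; undefined when all differences are equal.
    t_stat = bias / (sd / math.sqrt(len(diffs))) if sd > 0 else float("nan")

    return {
        "bias": bias,
        "sd": sd,
        "loa_lower": bias - rc,
        "loa_upper": bias + rc,
        "repeatability_coefficient": rc,
        "t_statistic": t_stat,
    }


# Illustrative values only, not data from the study (diameters in mm).
manual_mm = [30.0, 32.0, 34.0, 36.0]
ai_mm = [31.0, 31.0, 35.0, 35.0]
summary = bland_altman(manual_mm, ai_mm)
print(summary)
```

A bias close to zero with narrow limits of agreement would indicate that the two methods can be used interchangeably within that margin.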
We analyzed 240 participants (63 ± 16 years; 50% men). AI evaluation failed in 11 cases (4.6%) due to incorrect segmentation, leaving 229 cases for analysis. No difference was found in aortic diameter between manual and automatic measurements (32.7 ± 6.4 mm vs 32.7 ± 6.0 mm, p = 0.70). Bland-Altman analysis yielded no systematic bias and a repeatability coefficient of 4.0 mm for AI. Mean discrepancy of AI (1.3 ± 1.6 mm) was comparable to inter-reader variability (1.4 ± 1.4 mm); only at the proximal aortic arch did AI show higher discrepancy (2.0 ± 1.8 mm vs 0.9 ± 0.9 mm, p < 0.001). No difference between AI discrepancy and inter-reader variability was found for any subgroup (all p > 0.05).
The AI software can accurately measure thoracic aortic diameters, with discrepancy to a human reader similar to inter-reader variability in a range from normal to dilated aortas.