Hoppe Boj Friedrich, Rueckel Johannes, Rudolph Jan, Fink Nicola, Weidert Simon, Hohlbein Wolf, Cavalcanti-Kußmaul Adrian, Trappmann Lena, Munawwar Basel, Ricke Jens, Sabel Bastian Oliver
Department of Radiology, University Hospital, LMU Munich, Marchioninistr. 15, 81377, Munich, Germany.
Institute of Neuroradiology, University Hospital, LMU Munich, Munich, Germany.
Radiol Med. 2025 Mar;130(3):359-367. doi: 10.1007/s11547-025-01957-5. Epub 2025 Jan 26.
To develop an artificial intelligence (AI) algorithm for automated measurements of spinopelvic parameters on lateral radiographs and compare its performance to multiple experienced radiologists and surgeons.
On lateral full-spine radiographs of 295 consecutive patients, a two-stage region-based convolutional neural network (R-CNN) was trained to detect anatomical landmarks and calculate thoracic kyphosis (TK), lumbar lordosis (LL), sacral slope (SS), and sagittal vertical axis (SVA). Performance was evaluated on 65 radiographs not used for training, which were measured independently by 6 readers (3 radiologists, 3 surgeons); the median of the reader values for each measurement was set as the reference standard. Intraclass correlation coefficient (ICC), mean absolute error (MAE), and standard deviation (SD) were used for statistical analysis, and ANOVA was used to test for significant differences between the AI and the human readers.
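As an illustration of how angular and linear parameters can be derived once landmarks are detected, the following is a minimal sketch, not the authors' implementation. It assumes each endplate is given as two (x, y) pixel coordinates and that the pixel spacing is known. The function names, the landmark format, and the sign convention for SVA are hypothetical, and the abstract does not state which vertebral levels define TK and LL.

```python
import math

def line_angle_deg(p1, p2):
    """Angle of the line through p1 and p2 relative to the horizontal,
    in degrees, folded into (-90, 90] so point order does not matter."""
    ang = math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))
    if ang > 90:
        ang -= 180
    elif ang <= -90:
        ang += 180
    return ang

def cobb_angle_deg(upper_endplate, lower_endplate):
    """Cobb-style angle between two endplate lines (used for TK and LL).
    Each endplate is a pair of (x, y) points; the choice of vertebral
    levels is left to the caller (assumption)."""
    diff = abs(line_angle_deg(*upper_endplate) - line_angle_deg(*lower_endplate))
    return min(diff, 180 - diff)

def sacral_slope_deg(s1_endplate):
    """Sacral slope: angle between the S1 superior endplate and the horizontal."""
    return abs(line_angle_deg(*s1_endplate))

def sva_mm(c7_center, s1_posterosuperior_corner, mm_per_pixel):
    """Sagittal vertical axis: horizontal offset between the C7 plumb line
    and the posterosuperior corner of S1. The sign depends on image
    orientation (assumption: x increases toward anterior)."""
    return (c7_center[0] - s1_posterosuperior_corner[0]) * mm_per_pixel

# Hypothetical landmark coordinates in pixels, for illustration only.
upper = ((0, 0), (10, 0))      # horizontal endplate
lower = ((0, 0), (10, 10))     # endplate tilted by 45 degrees
print(cobb_angle_deg(upper, lower))
print(sva_mm((120, 0), (100, 500), mm_per_pixel=0.5))
```

Folding the line angle into a half-open 180° range keeps the result independent of the order in which the two endplate corners are detected.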
Automatic measurements (AI) showed excellent correlation with the reference standard, with all ICCs within the range of the readers (TK: 0.92 [AI] vs. 0.85-0.96 [readers]; LL: 0.95 vs. 0.87-0.98; SS: 0.93 vs. 0.89-0.98; SVA: 1.00 vs. 0.99-1.00; all p < 0.001). The MAE (± SD) of the AI was comparable to that of the six readers (TK: 3.71° (± 4.24) [AI] vs. 1.86-5.88° (± 3.48-6.17) [readers]; LL: 4.53° (± 4.68) vs. 2.21-5.34° (± 2.60-7.38); SS: 4.56° (± 6.10) vs. 2.20-4.76° (± 3.15-7.37); SVA: 2.44 mm (± 3.93) vs. 1.22-2.79 mm (± 2.42-7.11)), and ANOVA showed no significant difference between the errors of the AI and those of any human reader (all p > 0.05). Human reading time was on average 139 s per case (range: 86-231 s).
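The error metric reported above can be sketched as follows, assuming the reference standard is the per-case median across readers and that the SD is the sample standard deviation of the absolute errors. The abstract does not specify the SD convention, so that choice is an assumption, and the function names and numbers below are hypothetical toy values, not study data.

```python
import statistics

def reference_standard(reader_values):
    """Per-case median across readers.
    reader_values: one list of measurements per reader, aligned by case."""
    return [statistics.median(case) for case in zip(*reader_values)]

def mae_and_sd(measured, reference):
    """Mean absolute error and sample SD of the absolute errors
    (assumed convention) between a rater and the reference standard."""
    errors = [abs(m - r) for m, r in zip(measured, reference)]
    return statistics.mean(errors), statistics.stdev(errors)

# Toy angles in degrees for three cases and three readers (hypothetical).
readers = [
    [10, 20, 30],
    [12, 22, 28],
    [11, 19, 33],
]
ai = [13, 20, 27]

ref = reference_standard(readers)
mae, sd = mae_and_sd(ai, ref)
print(ref, round(mae, 2), round(sd, 2))
```

With an even number of readers, as in the study's six, `statistics.median` returns the mean of the two middle values.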
Our AI algorithm provides spinopelvic measurements whose accuracy lies within the variability of experienced readers, with the potential to save time and increase reproducibility.