Arakawa Shoutaro, Shinohara Akira, Arimura Daigo, Fukuda Takeshi, Takumi Yukihiro, Nishino Kazuyoshi, Saito Mitsuru
Department of Orthopaedic Surgery, The Jikei University School of Medicine, Tokyo 105-8461, Japan.
Department of Radiology, The Jikei University School of Medicine, Tokyo 105-8461, Japan.
JBMR Plus. 2025 Jan 25;9(4):ziaf017. doi: 10.1093/jbmrpl/ziaf017. eCollection 2025 Apr.
This exploratory study developed and evaluated an artificial intelligence (AI)-based algorithm for quantitative morphometry to assess vertebral body deformities indicative of fractures. A total of 709 radiographs from 355 cases were used for algorithm development and performance evaluation. The proposed algorithm combines a first-stage AI model that identifies the positions of thoracic and lumbar vertebral bodies in lateral radiographs with a second-stage AI model that annotates 6 landmarks for calculating three vertebral body height ratios. The first-stage AI model achieved a sensitivity of 97.6%, a precision of 95.1%, and an average of 0.43 false positives per image for vertebral body detection. In the second stage, the algorithm's performance was evaluated using an independent dataset of vertebrae annotated by 2 spine surgeons and 1 radiologist. The average landmark errors ranged from 2.9% to 3.3% on the X-axis and 2.9% to 4.0% on the Y-axis, with errors increasing in more severely collapsed vertebrae, particularly at central landmarks. Spearman's correlation coefficients for the three height ratios were 0.519-0.589, 0.558-0.647, and 0.735-0.770, comparable with the correlations observed among human evaluators. Bland-Altman analysis revealed systematic bias in some cases, indicating that the algorithm underestimated anterior and central height collapse in deformed vertebrae. However, the mean differences and limits of agreement between the algorithm and external evaluators were similar to those among the evaluators. Additionally, the algorithm processed each image within 10 s. These findings suggest that the algorithm performs comparably with human evaluators, demonstrating sufficient accuracy for clinical use. The proposed approach could enhance patient care if widely adopted in clinical settings.
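As a rough illustration of the second-stage calculation, the sketch below derives height ratios from 6 landmarks on one vertebral body. It is not the authors' implementation. The landmark order, the function name `height_ratios`, and the choice of ratios (central/anterior, central/posterior, anterior/posterior, as conventionally used in quantitative morphometry) are assumptions, because the abstract does not specify them.

```python
import math


def height_ratios(landmarks):
    """Compute vertebral body height ratios from six (x, y) landmarks.

    Assumed (hypothetical) landmark order:
    anterior-superior, central-superior, posterior-superior,
    anterior-inferior, central-inferior, posterior-inferior.
    """
    if len(landmarks) != 6:
        raise ValueError("expected exactly 6 landmarks")
    a_sup, c_sup, p_sup, a_inf, c_inf, p_inf = landmarks

    # Each height is the distance between the superior and inferior
    # endplate landmarks at the same position along the vertebral body.
    anterior = math.dist(a_sup, a_inf)
    central = math.dist(c_sup, c_inf)
    posterior = math.dist(p_sup, p_inf)
    if anterior == 0 or posterior == 0:
        raise ValueError("anterior and posterior heights must be non-zero")

    # Ratio definitions are an assumption based on conventional morphometry.
    return {
        "central_to_anterior": central / anterior,
        "central_to_posterior": central / posterior,
        "anterior_to_posterior": anterior / posterior,
    }


# Hypothetical wedge-shaped vertebra: heights 20 (anterior),
# 24 (central), and 30 (posterior) in arbitrary pixel units.
example = [(0, 5), (15, 3), (30, 0), (0, 25), (15, 27), (30, 30)]
ratios = height_ratios(example)
```

Because the ratios are dimensionless, the landmarks can be given in pixel coordinates without calibrating the radiograph to physical units.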