Yuh Woon Tak, Khil Eun Kyung, Yoon Yu Sung, Kim Burnyoung, Yoon Hongjun, Lim Jihe, Lee Kyoung Yeon, Yoo Yeong Seo, An Kyeong Deuk
Department of Neurosurgery, Hallym University Dongtan Sacred Heart Hospital, Hwaseong, Korea.
Department of Radiology, Hallym University Dongtan Sacred Heart Hospital, Hwaseong, Korea.
Neurospine. 2024 Mar;21(1):30-43. doi: 10.14245/ns.2347366.683. Epub 2024 Mar 31.
This study aimed to develop and validate a deep learning (DL) algorithm for the quantitative measurement of thoracolumbar (TL) fracture features, and to evaluate its efficacy across varying levels of clinical expertise.
Starting from a pretrained Mask Region-Based Convolutional Neural Network (Mask R-CNN) model originally developed for vertebral body segmentation and fracture detection, we fine-tuned the model and added a new module that measures four fracture metrics from lumbar spine lateral radiographs: compression rate (CR), Cobb angle (CA), Gardner angle (GA), and sagittal index (SI). The ground truth (GT) for these metrics was derived from six-point labeling by 3 radiologists. Training used 1,000 nonfractured and 318 fractured radiographs, and validation used 213 internal and 200 external fractured radiographs. The accuracy of the DL algorithm in quantifying fracture features was evaluated against GT using the intraclass correlation coefficient. Additionally, 4 readers with varying levels of expertise, including trainees and an attending spine surgeon, performed measurements with and without DL assistance, and their results were compared with GT and with the DL model.
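The angular metrics above are, in general terms, angles between lines drawn through labeled vertebral landmarks. The following is a minimal sketch of that geometric step only. It assumes each endplate is represented by two (x, y) landmark points; the function name and the example coordinates are hypothetical, and the abstract does not state which of the six labeled points define each metric.

```python
import math


def angle_between_lines(p1, p2, q1, q2):
    """Return the acute angle, in degrees, between line p1-p2 and line q1-q2.

    Each argument is an (x, y) landmark in image coordinates.
    Assumption: the landmark-to-metric mapping is illustrative and is not
    taken from the study.
    """
    ux, uy = p2[0] - p1[0], p2[1] - p1[1]
    vx, vy = q2[0] - q1[0], q2[1] - q1[1]
    cross = ux * vy - uy * vx
    dot = ux * vx + uy * vy
    # Taking absolute values folds the result into [0, 90] degrees, so the
    # order in which the two landmarks of a line are given does not matter.
    return math.degrees(math.atan2(abs(cross), abs(dot)))


# Hypothetical landmarks: a horizontal endplate and one tilted by 45 degrees.
upper_endplate = ((0.0, 0.0), (1.0, 0.0))
lower_endplate = ((0.0, 0.0), (1.0, 1.0))
tilt = angle_between_lines(*upper_endplate, *lower_endplate)
```

A Cobb-type measurement would apply such a function to the superior endplate of one vertebra and the inferior endplate of another; a signed variant would be needed to distinguish kyphosis from lordosis.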
The DL algorithm demonstrated good to excellent agreement with GT for CR, CA, GA, and SI, with intraclass correlation coefficients of 0.860, 0.944, 0.932, and 0.779, respectively, in internal validation and 0.836, 0.940, 0.916, and 0.815, respectively, in external validation. DL assistance significantly improved most of the readers' measurements, particularly those of trainees.
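The agreement values reported above are intraclass correlation coefficients. As an illustration of how such a coefficient is computed, the sketch below implements the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). This choice is an assumption: the abstract does not say which ICC form was used, and the example data are invented for demonstration.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table of ratings.

    ratings: list of rows, one row per radiograph, one column per rater
    (for example, column 0 = ground truth, column 1 = DL measurement).
    Assumption: the ICC form is illustrative, not the study's stated method.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-rater mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Invented example: the second rater reads every case exactly 1 unit higher.
identical = icc_2_1([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
offset = icc_2_1([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
```

Because this form measures absolute agreement, a constant offset between raters lowers the coefficient even when the two sets of readings are perfectly correlated, which is why it suits comparisons against a ground truth.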
The DL algorithm was validated as an accurate tool for quantifying TL fracture features using radiographs. DL-assisted measurement is expected to expedite the diagnostic process and enhance reliability, particularly benefiting less experienced clinicians.