Remedy Logic, 1177 Avenue of the Americas, 5th Floor, New York, NY, 10036, USA.
Hospital for Special Surgery, 535 East 70th Street, New York, NY, 10021, USA.
Eur Spine J. 2024 Mar;33(3):941-948. doi: 10.1007/s00586-023-08089-2. Epub 2023 Dec 27.
To develop a three-stage convolutional neural network (CNN) approach that segments anatomical structures, classifies the presence of lumbar spinal stenosis (LSS) for all three stenosis types (central, lateral recess, and foraminal), and grades its severity on spine MRI, and to demonstrate its efficacy as an accurate and consistent diagnostic tool.
The three-stage model was trained on 1635 annotated lumbar spine MRI studies consisting of T2-weighted sagittal and axial planes at each vertebral level. Accuracy of the model was evaluated on an external validation set of 150 MRI studies graded on a scale of absent, mild, moderate or severe by a panel of 7 radiologists. The reference standard for all types was determined by majority voting and in case of disagreement, adjudicated by an external radiologist. The radiologists' diagnoses were then compared to the diagnoses of the model.
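The reference-standard step described above can be illustrated with a minimal sketch. The abstract does not specify how "majority" or "disagreement" was defined, so this example assumes the most frequent grade among the readers wins and that a tie for first place is sent to the adjudicator. The function name `reference_grade` and its parameters are hypothetical.

```python
from collections import Counter

# Severity scale named in the abstract.
GRADES = ("absent", "mild", "moderate", "severe")


def reference_grade(votes, adjudicator_grade=None):
    """Return the most frequent grade among the readers' votes.

    Assumption (not stated in the abstract): a tie for the most frequent
    grade counts as disagreement and is resolved by the adjudicator.
    """
    for grade in votes:
        if grade not in GRADES:
            raise ValueError(f"unknown grade: {grade!r}")
    ranked = Counter(votes).most_common()
    top_grade, top_count = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_count
    if tied:
        if adjudicator_grade is None:
            raise ValueError("readers are tied; an adjudicator grade is required")
        return adjudicator_grade
    return top_grade


# Seven hypothetical reader votes for one vertebral level.
panel = ["mild", "mild", "mild", "mild", "severe", "severe", "moderate"]
print(reference_grade(panel))
```

With seven readers and four grades, a tie is still possible (for example 3-3-1), which is why the adjudicator path is kept explicit rather than assumed away.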
The model performed comparably to the radiologist average for all 3 stenosis types, both in determining the presence or absence of LSS and in classifying its severity. For central canal stenosis, the sensitivity, specificity and AUROC of the CNN for binary (presence/absence) classification were (0.971, 0.864, 0.963), compared to the radiologist average of (0.786, 0.899, 0.842). For lateral recess stenosis, the sensitivity, specificity and AUROC of the CNN were (0.853, 0.787, 0.907), compared to the radiologist average of (0.713, 0.898, 0.805). For foraminal stenosis, the sensitivity, specificity and AUROC of the CNN were (0.942, 0.844, 0.950), compared to the radiologist average of (0.879, 0.877, 0.878). Multi-class severity classification showed similarly comparable statistics.
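As a rough illustration of how the binary metrics above are defined, the following sketch computes sensitivity, specificity and AUROC from labels and scores. It is not the authors' evaluation code; the function names and the toy data are hypothetical. AUROC is computed here as the probability that a positive case scores higher than a negative case, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = stenosis present, 0 = absent (hypothetical, not study data).
truth = [1, 1, 0, 0]
reader_calls = [1, 0, 0, 0]          # a reader gives only yes/no calls
model_scores = [0.9, 0.8, 0.3, 0.1]  # a model gives continuous scores

sens, spec = sensitivity_specificity(truth, reader_calls)
print(sens, spec, auroc(truth, reader_calls), auroc(truth, model_scores))
```

For a reader who gives only yes/no calls, this AUROC reduces to (sensitivity + specificity) / 2. The reported radiologist average for central canal stenosis (0.786, 0.899, 0.842) is numerically consistent with that relation, although the abstract does not state how the radiologist AUROC was computed.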
The CNN showed comparable performance to radiologist subspecialists for the detection and classification of LSS. The integration of neural network models in the detection of LSS could bring higher accuracy, efficiency, consistency, and post-hoc interpretability in diagnostic practices.