Abbott Allan, Pedersen Casper Friis, Hedevik Henrik, Parai Catharina, Gorosito Martin A, Andersen Mikkel, Ingebrigtsen Tor, Solberg Tore K, Grotle Margreth, Berg Bjørnar
Unit of Physiotherapy, Department of Health, Medicine and Caring Sciences, Linköping University, Linköping; Department of Orthopedics, Linköping University Hospital, Linköping, Sweden
University of Southern Denmark, Center for Spine Surgery and Research, Spine Center of Southern Denmark, Lillebaelt Hospital, Kolding, Denmark.
Acta Orthop. 2025 Jul 7;96:512-520. doi: 10.2340/17453674.2025.44251.
We aimed to externally validate machine learning models developed in Norway by evaluating how well they predict disability and pain outcomes 12 months after lumbar disc herniation surgery in Swedish and Danish cohorts.
Data were extracted for patients undergoing microdiscectomy or open discectomy for lumbar disc herniation in the NORspine, SweSpine, and DaneSpine national registries. The outcomes of interest were changes in the Oswestry Disability Index (ODI) (≥ 22 points), the Numeric Rating Scale (NRS) for back pain (≥ 2 points), and the NRS for leg pain (≥ 4 points). Model performance was evaluated by discrimination (C-statistic), calibration, overall fit, and net benefit.
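As an illustration of the discrimination metric named above, the following minimal sketch computes a C-statistic by pairwise comparison. It is not the authors' code: the function names `is_success` and `c_statistic` are hypothetical, the toy data are invented, and the sketch assumes that a "change" means baseline score minus 12-month score, so that a positive value is an improvement.

```python
def is_success(baseline, follow_up, threshold):
    """Dichotomize an outcome: True if the score dropped by at least `threshold`.

    Assumption: improvement = baseline - follow_up (lower ODI/NRS is better).
    """
    return (baseline - follow_up) >= threshold


def c_statistic(y_true, y_prob):
    """Share of (success, non-success) pairs in which the success case
    received the higher predicted probability; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one success and one non-success")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Toy data (invented for illustration only).
outcomes = [1, 1, 0, 0]
predicted = [0.9, 0.4, 0.6, 0.2]
print(c_statistic(outcomes, predicted))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for registry-scale data.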
For the ODI model, the NORspine cohort included 22,529 patients, the SweSpine cohort 10,129 patients, and the DaneSpine cohort 5,670 patients. The ODI model's C-statistic ranged from 0.76 to 0.81, and calibration slope point estimates ranged from 0.84 to 0.99. The C-statistic for NRS back pain ranged from 0.70 to 0.76, and calibration slopes ranged from 0.79 to 1.03. The C-statistic for NRS leg pain ranged from 0.71 to 0.74, and calibration slopes ranged from 0.90 to 1.02. Overall fit and calibration metrics were acceptable, with minor-to-modest but explainable heterogeneity in the calibration plots. Decision curve analyses showed a clear potential net benefit of treating in accordance with the prediction models compared with treating all patients or none.
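The comparison behind a decision curve can be sketched with the standard net benefit formula, NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The code below is an illustrative sketch, not the study's analysis: the function names are hypothetical and the toy data are invented. Treating no one has a net benefit of 0 by definition.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating only patients whose predicted probability
    is at or above `threshold` (0 < threshold < 1)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of prediction."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (invented for illustration only).
outcomes = [1, 1, 1, 0, 0]
predicted = [0.9, 0.8, 0.3, 0.6, 0.1]
pt = 0.7
print(net_benefit(outcomes, predicted, pt))    # model-guided strategy
print(net_benefit_treat_all(outcomes, pt))     # treat-all strategy
```

A decision curve repeats this calculation over a range of threshold probabilities; a model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and 0.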
Machine learning models predicting treatment success/non-success in disability and pain 12 months after surgery for lumbar disc herniation showed acceptable discrimination, calibration, overall fit, and net benefit, and this performance was reproducible in similar international contexts. Future clinical impact studies are required.