Department of Rehabilitation Science and Health Technology, Faculty of Health Science, OsloMet-Oslo Metropolitan University, Oslo, Norway.
Research and Communication Unit for Musculoskeletal Health (FORMI), Division of Clinical Neuroscience, Oslo University Hospital, Oslo, Norway.
Pain. 2023 Dec 1;164(12):2759-2768. doi: 10.1097/j.pain.0000000000002974. Epub 2023 Jul 24.
Prognostic prediction models for 3 different definitions of nonrecovery were developed in the Back Complaints in the Elders study in the Netherlands. The models' performance was good (optimism-adjusted area under the receiver operating characteristic curve [AUC] ≥0.77, R² ≥0.3). This study aimed to assess the external validity of the 3 prognostic prediction models in the Norwegian Back Complaints in the Elders study. We conducted a prospective cohort study including 452 patients aged ≥55 years who sought primary care for a new episode of back pain. Nonrecovery was defined for 2 outcomes, combining 6- and 12-month follow-up data: persistent back pain (≥3/10 on a numeric rating scale) and persistent disability (≥4/24 on the Roland-Morris Disability Questionnaire). We could not assess the third model (self-reported nonrecovery) because of substantial missing data (>50%). The models consisted of biopsychosocial prognostic factors. In step 1, we assessed Nagelkerke R², discrimination (AUC), and calibration (calibration-in-the-large [CITL], slope, and calibration plot). In step 2, we recalibrated the models based on CITL and slope. In step 3, we reestimated the model coefficients and assessed whether this improved performance. The back pain model demonstrated acceptable discrimination (AUC 0.74, 95% confidence interval: 0.69-0.79), and R² was 0.23. The disability model demonstrated excellent discrimination (AUC 0.81, 95% confidence interval: 0.76-0.85), and R² was 0.35. Both models had poor calibration (CITL <0, slope <1). Recalibration yielded acceptable calibration for both models, according to the calibration plots. Step 3 did not improve performance substantially. The recalibrated models may need further external validation, and the models' clinical impact should be assessed.
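The validation measures named in the abstract (AUC, CITL, calibration slope) and the recalibration in step 2 can be illustrated with a minimal sketch. It makes several assumptions: the data are synthetic, not the study's; the function names `auc` and `fit_calibration` are hypothetical; and recalibration is done by the common logistic approach of refitting an intercept and slope on the original linear predictor, which may differ from the authors' exact procedure. The simulated model overpredicts and is too extreme, so it reproduces the pattern reported above (CITL <0, slope <1).

```python
import math
import random

def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def auc(y, score):
    """Probability that a random case outranks a random non-case (ties count half)."""
    pos = [s for yi, s in zip(y, score) if yi == 1]
    neg = [s for yi, s in zip(y, score) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def fit_calibration(y, lp, fit_slope=True, iters=25):
    """Newton-Raphson fit of logit P(y=1) = a + b * lp.

    With fit_slope=False, b is fixed at 1 (lp acts as an offset), so the
    returned intercept a is the calibration-in-the-large (CITL).
    With fit_slope=True, b is the calibration slope.
    """
    a, b = 0.0, 1.0
    for _ in range(iters):
        mu = [expit(a + b * x) for x in lp]
        w = [m * (1.0 - m) for m in mu]
        g_a = sum(yi - m for yi, m in zip(y, mu))      # score for intercept
        h_aa = sum(w)
        if not fit_slope:
            a += g_a / h_aa
            continue
        g_b = sum((yi - m) * x for yi, m, x in zip(y, mu, lp))  # score for slope
        h_ab = sum(wi * x for wi, x in zip(w, lp))
        h_bb = sum(wi * x * x for wi, x in zip(w, lp))
        det = h_aa * h_bb - h_ab ** 2
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Synthetic validation cohort (assumption): the model's linear predictor is
# too extreme (true slope 0.6) and predicts too high risks (true intercept -0.5).
random.seed(0)
n = 1000
lp = [random.gauss(0.0, 1.5) for _ in range(n)]
y = [1 if random.random() < expit(-0.5 + 0.6 * x) else 0 for x in lp]

# Step 1: performance of the original model in the new cohort.
citl, _ = fit_calibration(y, lp, fit_slope=False)
intercept, slope = fit_calibration(y, lp)
auc_before = auc(y, lp)

# Step 2: logistic recalibration of the linear predictor.
lp_recal = [intercept + slope * x for x in lp]
citl_recal, _ = fit_calibration(y, lp_recal, fit_slope=False)
_, slope_recal = fit_calibration(y, lp_recal)
auc_after = auc(y, lp_recal)

print(f"before: CITL={citl:.2f}, slope={slope:.2f}, AUC={auc_before:.3f}")
print(f"after:  CITL={citl_recal:.2f}, slope={slope_recal:.2f}, AUC={auc_after:.3f}")
```

Because recalibration with a positive slope is a monotone transformation of the linear predictor, it changes calibration but leaves the ranking of patients, and therefore the AUC, unchanged. Improving discrimination would require reestimating the coefficients, as in step 3.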