Department of Neurosurgery, University Hospitals Leuven, Herestraat 49, 3000, Leuven, Belgium.
Eur Spine J. 2021 Oct;30(10):2800-2824. doi: 10.1007/s00586-021-06954-6. Epub 2021 Aug 16.
To review the evidence on the relative performance of the available prognostic scores for survival after surgery for spinal metastases, in order to provide a recommendation for use in clinical practice.
A systematic review of comparative external validation studies assessing the performance of prognostic scores for survival in independent cohorts was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines. Eligible studies were identified through Medline and Embase until May 2021. Studies were included when they compared at least four survival scoring systems in surgical or mixed cohorts across all primary tumor types. Predictive performance was assessed based on discrimination and calibration for 3-month, 1-year and overall survival, and generalizability was assessed based on the characteristics of the development cohort and external validation cohorts. Risk of bias and concern regarding applicability were assessed based on the 'Prediction model study Risk Of Bias Assessment Tool' (PROBAST).
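As an illustration of the two performance measures named above, the following is a minimal sketch, not the method used by the included studies. It assumes complete follow-up at a fixed time point (no censoring), so the outcome is binary. Discrimination is computed as the c-statistic (equivalent to the AUC for a binary outcome), and calibration is summarized as calibration-in-the-large (observed event rate minus mean predicted risk). The function names and the toy data are hypothetical.

```python
from typing import Sequence


def c_statistic(predicted: Sequence[float], died: Sequence[int]) -> float:
    """Discrimination: the probability that a randomly chosen patient who
    died received a higher predicted risk than a randomly chosen survivor.
    Tied predictions count as half-concordant."""
    events = [p for p, d in zip(predicted, died) if d == 1]
    non_events = [p for p, d in zip(predicted, died) if d == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def calibration_in_the_large(predicted: Sequence[float],
                             died: Sequence[int]) -> float:
    """Calibration summary: observed event rate minus mean predicted risk.
    A value near 0 means predictions are right on average; a negative
    value means the score overestimates risk."""
    if len(predicted) != len(died) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(died) / len(died) - sum(predicted) / len(predicted)


# Hypothetical toy data: predicted 3-month mortality risk and observed outcome
# (1 = died within 3 months, 0 = alive at 3 months).
predicted_risk = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
died_within_3_months = [1, 1, 0, 1, 0, 0]

if __name__ == "__main__":
    print("c-statistic:", c_statistic(predicted_risk, died_within_3_months))
    print("calibration-in-the-large:",
          calibration_in_the_large(predicted_risk, died_within_3_months))
```

A c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. With censored survival data, a censoring-aware concordance measure would be needed instead of this simplified binary version.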
Twelve studies fulfilled the inclusion criteria and covered 17 scoring systems across 5,130 patients. Several scores suffered from suboptimal development and validation. The SORG Nomogram, developed in a large surgical cohort, showed good discrimination for 3-month and 1-year survival and good calibration, and was superior in direct comparison, with low risk of bias and low concern regarding applicability. Machine learning algorithms are promising, as they performed equally well in direct comparison. Tokuhashi, Tomita and other traditional risk scores showed suboptimal performance.
The SORG Nomogram and machine learning algorithms show superior performance in predicting survival after surgery for spinal metastases. Further improvement can still be obtained through comparative validation in large, multicenter, prospective cohorts. Given the heterogeneity of spinal metastases, superior methodology of development and validation is key to improving future machine learning systems.