Using a national surgical database to predict complications following posterior lumbar surgery and comparing the area under the curve and F1-score for the assessment of prognostic capability.

Affiliations

Ottawa Spine Collaborative Analytics Network, The Ottawa Hospital, Ottawa, ON, Canada K1Y 4E9.

Ottawa Spine Collaborative Analytics Network, The Ottawa Hospital, Ottawa, ON, Canada K1Y 4E9; Ottawa Hospital Research Institute, Ottawa, ON, Canada K1Y 4E9.

Publication information

Spine J. 2021 Jul;21(7):1135-1142. doi: 10.1016/j.spinee.2021.02.007. Epub 2021 Feb 16.

DOI:10.1016/j.spinee.2021.02.007

PMID:33601012

Abstract

BACKGROUND

With spinal surgery rates increasing in North America, models that can accurately predict which patients are at greater risk of developing complications are needed. However, previously published methods that used large, multi-centre databases to develop prediction models have relied on the receiver operating characteristic (ROC) curve and its associated area under the curve (AUC) to assess model performance. It has recently been found that a precision-recall curve with the associated F1-score could provide a more realistic analysis of these models.

PURPOSE

To develop a logistic regression (LR) model for predicting complications following posterior lumbar spine surgery, and then to assess whether the model's measured performance differs when evaluated with the AUC versus the F1-score.

STUDY DESIGN

Retrospective review of a prospective cohort.

PATIENT SAMPLE

The American College of Surgeons National Surgical Quality Improvement Program (NSQIP) registry was used. All patients who underwent posterior lumbar spine surgery between 2005 and 2016 and had appropriate data were included.

OUTCOME MEASURES

Both the AUC and F1-score were utilized to assess the prognostic performance of the prediction model.
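As a rough illustration of the two metrics, both can be computed directly from predicted probabilities and observed outcomes. The sketch below is not the authors' code: the function names `auc` and `f1_score`, the 0.5 decision threshold, and the toy data are all assumptions made for illustration.

```python
def auc(y_true, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a randomly chosen positive case scores higher
    than a randomly chosen negative case, counting ties as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true, scores, threshold=0.5):
    """Harmonic mean of precision and recall at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical, not from NSQIP): 1 = complication, 0 = none.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.6, 0.3, 0.7, 0.4]
toy_auc = auc(y_true, scores)
toy_f1 = f1_score(y_true, scores)
```

Note that the AUC summarizes ranking quality over all thresholds, whereas the F1-score depends on the single threshold chosen, which is one reason the two can diverge.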

METHODS

To develop the LR model used to predict a complication during or following spine surgery, 19 variables were selected from the NSQIP registry by three orthopedic spine surgeons. Two datasets were developed for this analysis: (1) an imbalanced dataset, which was taken directly from the NSQIP registry, and (2) a down-sampled set. The purpose of the down-sampled set was to balance the data in order to evaluate whether balancing had an effect on model performance. The AUC and F1-score were calculated for both datasets.
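The abstract does not describe how the down-sampled set was built. One common approach, assumed here, is random under-sampling of the majority class until both classes are the same size. The helper `downsample`, the record layout, and the 1,000-case cohort below are hypothetical.

```python
import random


def downsample(rows, label_of, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    positives = [r for r in rows if label_of(r) == 1]
    negatives = [r for r in rows if label_of(r) == 0]
    # Order the two groups by size so the smaller one is kept in full.
    minority, majority = sorted([positives, negatives], key=len)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    kept = rng.sample(majority, len(minority))
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced


# Hypothetical cohort: 1,000 cases with a 10% complication rate.
cohort = [{"case_id": i, "complication": 1 if i < 100 else 0} for i in range(1000)]
balanced = downsample(cohort, lambda r: r["complication"])
n_pos = sum(r["complication"] for r in balanced)
```

Under-sampling discards most of the majority class, so the balanced set here holds 200 of the original 1,000 cases.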

RESULTS

Within the NSQIP database, 52,787 spine surgery cases were identified, of which only 10% had complications during surgery. On the imbalanced dataset, the LR model showed a large difference between the AUC (0.69) and the F1-score (0.075). However, no major difference existed between the AUC and F1-score when the data were balanced and the LR model was reapplied (AUC 0.69 and F1-score 0.62).

CONCLUSIONS

The F1-score detected drastically lower performance for the prediction of complications on the imbalanced data, but detected performance similar to the AUC when balancing techniques were applied to the dataset. This difference is due to a low precision score when many false positive classifications are present, which the AUC does not identify. This lowers the utility of the AUC, as many of the datasets used in medicine are imbalanced. Therefore, we recommend using the F1-score on large, prospective databases when the data are imbalanced with a large number of true negative classifications.
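The mechanism behind this conclusion can be sketched with a little arithmetic: for a classifier with fixed sensitivity and specificity (the quantities that define the ROC curve), precision still depends on how common the outcome is. The operating point of 0.65 used below is an assumed, illustrative value, not one reported in the study, and the function name is hypothetical.

```python
def precision_recall_f1(sensitivity, specificity, prevalence):
    """Expected precision, recall, and F1 for a classifier with fixed
    sensitivity and specificity applied at a given outcome prevalence."""
    tp = sensitivity * prevalence                  # true-positive fraction
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false-positive fraction
    if tp == 0:
        return 0.0, sensitivity, 0.0
    precision = tp / (tp + fp)
    recall = sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative operating point (an assumption, not taken from the study).
SENS, SPEC = 0.65, 0.65
imbalanced = precision_recall_f1(SENS, SPEC, prevalence=0.10)
balanced = precision_recall_f1(SENS, SPEC, prevalence=0.50)
```

Under these assumed values, precision falls from 0.65 at 50% prevalence to about 0.17 at 10% prevalence, and the F1-score falls from 0.65 to about 0.27, while sensitivity and specificity are unchanged. This does not reproduce the study's reported figures; it only shows the direction of the effect.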


