Statistical models versus machine learning for competing risks: development and validation of prognostic models.

Affiliations

Mathematical Institute (MI) Leiden University, Niels Bohrweg 1, 2333 CA, Leiden, The Netherlands.

Department of Biomedical Data Sciences, Section Medical Statistics, Leiden University Medical Center (LUMC), Albinusdreef 2, 2333 ZA, Leiden, The Netherlands.

Publication information

BMC Med Res Methodol. 2023 Feb 24;23(1):51. doi: 10.1186/s12874-023-01866-z.

DOI:10.1186/s12874-023-01866-z

PMID:36829145

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9951458/

Abstract

BACKGROUND

In health research, several chronic diseases are susceptible to competing risks (CRs). Statistical models (SM) were originally developed to estimate the cumulative incidence of an event in the presence of CRs. With growing interest in applying machine learning (ML) to clinical prediction, these techniques have also been extended to model CRs, but the literature is limited. Here, our aim is to investigate the potential role of ML versus SM for CRs within non-complex data (small/medium sample size, low-dimensional setting).
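To make "cumulative incidence in the presence of CRs" concrete, the following is a minimal sketch of the standard nonparametric (Aalen-Johansen) estimator in plain Python. It is an illustration only and not the authors' code; the function name `cumulative_incidence`, the event coding (0 = censored, 1 = event of interest, 2 = competing event), and the toy data are assumptions made for this example.

```python
from collections import Counter


def cumulative_incidence(times, events, cause=1):
    """Aalen-Johansen estimate of the cumulative incidence of `cause`.

    times  : follow-up time per subject
    events : 0 = censored, 1, 2, ... = type of event observed (assumed coding)
    Returns a list of (time, cumulative incidence) at each distinct event time.
    """
    # Tally how many subjects of each status share each distinct time.
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    surv = 1.0  # probability of being free of ANY event just before time t
    cif = 0.0
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_all = sum(c for e, c in counts.items() if e != 0)
        d_cause = counts.get(cause, 0)
        if d_all > 0:
            # Increment: P(event-free so far) * cause-specific hazard at t.
            cif += surv * d_cause / n_at_risk
            surv *= 1.0 - d_all / n_at_risk
            curve.append((t, cif))
        # Everyone observed at t (events and censored) leaves the risk set.
        n_at_risk -= sum(counts.values())
    return curve


# Toy data (hypothetical): five subjects, two event types, two censored.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 0]
curve = cumulative_incidence(toy_times, toy_events, cause=1)
```

Treating the competing event as ordinary censoring and using one minus the Kaplan-Meier estimate instead would generally overestimate the incidence of the event of interest, which is why the overall event-free survival enters each increment.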

METHODS

A dataset with 3826 retrospectively collected patients with extremity soft-tissue sarcoma (eSTS) and nine predictors is used to evaluate model-predictive performance in terms of discrimination and calibration. Two SM (cause-specific Cox, Fine-Gray) and three ML techniques are compared for CRs in a simple clinical setting. ML models include an original partial logistic artificial neural network for CRs (PLANNCR original), a PLANNCR with novel specifications in terms of architecture (PLANNCR extended), and a random survival forest for CRs (RSFCR). The clinical endpoint is the time in years between surgery and disease progression (event of interest) or death (competing event). Time points of interest are 2, 5, and 10 years.
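For readers less familiar with the two SM named above, the textbook formulations can be summarised as follows. The notation ($T$ for event time, $D$ for event type, $x$ for the predictor vector, $\beta_k$ and $\gamma_k$ for coefficient vectors) is assumed here for illustration and is not taken from the abstract.

```latex
% Cause-specific hazard for cause k, and its Cox model
\lambda_k(t) = \lim_{\Delta t \to 0}
  \frac{P(t \le T < t + \Delta t,\ D = k \mid T \ge t)}{\Delta t},
\qquad
\lambda_k(t \mid x) = \lambda_{k0}(t)\, \exp(\beta_k^\top x)

% Cumulative incidence of cause k depends on ALL cause-specific hazards
F_k(t) = P(T \le t,\ D = k) = \int_0^t S(u)\, \lambda_k(u)\, du,
\qquad
S(u) = \exp\Bigl(-\sum_j \int_0^u \lambda_j(s)\, ds\Bigr)

% Fine-Gray: proportional hazards on the subdistribution hazard
\tilde{\lambda}_k(t) = -\frac{d}{dt} \log\bigl(1 - F_k(t)\bigr),
\qquad
\tilde{\lambda}_k(t \mid x) = \tilde{\lambda}_{k0}(t)\, \exp(\gamma_k^\top x)
```

The practical difference is that a cause-specific Cox model needs one fitted model per cause to recover $F_k(t)$, whereas the Fine-Gray model links the predictors to the cumulative incidence of the event of interest directly.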

RESULTS

Based on the original eSTS data, 100 bootstrapped training datasets are drawn. Performance of the final models is assessed on validation data (left-out samples) using the Brier score and the Area Under the Curve (AUC) for CRs. Miscalibration (absolute accuracy error) is also estimated. Results show that the ML models are able to reach a comparable performance versus the SM at 2, 5, and 10 years regarding both Brier score and AUC (95% confidence intervals overlapped). However, the SM are frequently better calibrated.
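The resampling scheme described above can be sketched as follows: each bootstrap sample serves as training data, and the patients not drawn into it form the left-out validation set. This is a simplified illustration under stated assumptions, not the authors' pipeline. The helper names `bootstrap_split` and `brier_score` are hypothetical, and the Brier score shown ignores censoring, whereas a competing-risks analysis would normally weight observations to account for it.

```python
import random


def bootstrap_split(n, rng):
    """Draw n indices with replacement; the undrawn indices are the left-out samples."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag


def brier_score(pred_risk, had_event):
    """Mean squared difference between predicted risk and the 0/1 outcome.

    Simplification (assumption): every subject's status at the prediction
    horizon is known, so no censoring weights are applied.
    """
    return sum((p - y) ** 2 for p, y in zip(pred_risk, had_event)) / len(pred_risk)


rng = random.Random(2023)   # fixed seed so the sketch is reproducible
n_patients = 3826           # sample size reported in the abstract
n_boot = 100                # number of bootstrap training sets in the abstract

oob_fractions = []
for _ in range(n_boot):
    in_bag, out_of_bag = bootstrap_split(n_patients, rng)
    # A model would be fitted on `in_bag` and scored on `out_of_bag` here.
    oob_fractions.append(len(out_of_bag) / n_patients)

mean_oob_fraction = sum(oob_fractions) / n_boot
```

On average roughly 37% of patients are left out of each bootstrap sample, so every replicate yields a sizeable validation set without a separate hold-out split.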

CONCLUSIONS

Overall, ML techniques are less practical as they require substantial implementation time (data preprocessing, hyperparameter tuning, computational intensity), whereas regression methods can perform well without the additional workload of model training. As such, for non-complex real-life survival data, these techniques should only be applied as a complement to SM, as exploratory tools for assessing model performance. More attention to model calibration is urgently needed.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d83/9951458/9ded4d4c55c6/12874_2023_1866_Fig1_HTML.jpg
