

Balancing accuracy and Interpretability: An R package assessing complex relationships beyond the Cox model and applications to clinical prediction.

Authors

Shamsutdinova Diana, Stamate Daniel, Stahl Daniel

Affiliations

Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom.

Data Science and Soft Computing Lab, Computing Department, Goldsmiths University of London, United Kingdom; School of Health Sciences, University of Manchester, Manchester, United Kingdom.

Publication information

Int J Med Inform. 2025 Feb;194:105700. doi: 10.1016/j.ijmedinf.2024.105700. Epub 2024 Nov 10.

DOI: 10.1016/j.ijmedinf.2024.105700
PMID: 39546831

Abstract

BACKGROUND

Accurate and interpretable models are essential for clinical decision-making, where predictions can directly impact patient care. Machine learning (ML) survival methods can handle complex multidimensional data and achieve high accuracy but require post-hoc explanations. Traditional models such as the Cox Proportional Hazards Model (Cox-PH) are less flexible, but fast, stable, and intrinsically transparent. Moreover, ML does not always outperform Cox-PH in clinical settings, warranting a diligent model validation. We aimed to develop a set of R functions to help explore the limits of Cox-PH compared to the tree-based and deep learning survival models for clinical prediction modelling, employing ensemble learning and nested cross-validation.

METHODS

We developed a set of R functions, publicly available as the package "survcompare". It supports Cox-PH and Cox-Lasso, with Survival Random Forest (SRF) and DeepHit as the ML alternatives, along with ensemble methods that integrate Cox-PH with SRF or DeepHit and are designed to isolate the marginal value of ML. The package performs repeated nested cross-validation and tests the statistical significance of ML's superiority using survival-specific performance metrics: the concordance index, time-dependent AUC-ROC, and calibration slope. To gain practical insights, we applied this methodology to clinical and simulated datasets of varying complexity and size.
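
As a rough illustration of two ideas named in the Methods, the sketch below shows how repeated nested cross-validation splits can be generated and how a concordance index can be computed on right-censored data. It is written in plain Python rather than R, and it is not the survcompare API: the function names `harrell_c`, `k_folds`, and `nested_cv_splits`, their parameters, and the toy data are all assumptions made for this example. The concordance index here follows Harrell's pairwise definition and, for simplicity, skips pairs with tied observed times.

```python
import random
from itertools import combinations


def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored data (illustrative).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if the times differ and the shorter one is an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def k_folds(indices, k, rng):
    """Shuffle indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[f::k] for f in range(k)]


def nested_cv_splits(n, outer_k=5, inner_k=3, repeats=2, seed=0):
    """Yield (repeat, outer_train, outer_test, inner_splits).

    inner_splits holds (inner_train, inner_val) pairs drawn only from
    outer_train, so hyperparameter tuning never sees the outer test fold.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        outer = k_folds(range(n), outer_k, rng)
        for t, test in enumerate(outer):
            train = [i for f, fold in enumerate(outer) if f != t for i in fold]
            inner = k_folds(train, inner_k, rng)
            inner_splits = [
                ([i for g, fold in enumerate(inner) if g != v for i in fold],
                 inner[v])
                for v in range(inner_k)
            ]
            yield r, train, test, inner_splits


# Toy data, invented for this example only.
toy_times = [5, 8, 3, 10, 6]
toy_events = [1, 0, 1, 1, 0]
toy_risks = [0.7, 0.2, 0.5, 0.1, 0.4]
toy_c = harrell_c(toy_times, toy_events, toy_risks)
toy_splits = list(nested_cv_splits(n=20, outer_k=5, inner_k=3, repeats=2))
```

In a comparison of the kind the abstract describes, each candidate model would be tuned on the inner splits, refitted on the outer training fold, and scored on the outer test fold, giving one score per model per outer fold and repeat. How the package then tests those scores for significance is not specified in the abstract, so that step is left out here.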

RESULTS

In simulated data with non-linearities or interactions, ML models outperformed Cox-PH at sample sizes ≥ 500. ML superiority was also observed in imaging and high-dimensional clinical data. However, for tabular clinical data, the performance gains of ML were minimal; in some cases, regularised Cox-Lasso recovered much of the ML's performance advantage with significantly faster computations. Ensemble methods combining Cox-PH and ML predictions were instrumental in quantifying Cox-PH's limits and improving ML calibration. Traditional models like Cox-PH or Cox-Lasso should not be overlooked while developing clinical predictive models from tabular data or data of limited size.

CONCLUSION

Our package offers researchers a framework and practical tool for evaluating the accuracy-interpretability trade-off, helping make informed decisions about model selection.

Similar articles

1. Balancing accuracy and Interpretability: An R package assessing complex relationships beyond the Cox model and applications to clinical prediction.
Int J Med Inform. 2025 Feb;194:105700. doi: 10.1016/j.ijmedinf.2024.105700. Epub 2024 Nov 10.
2. Towards clinical prediction with transparency: An explainable AI approach to survival modelling in residential aged care.
Comput Methods Programs Biomed. 2025 May;263:108653. doi: 10.1016/j.cmpb.2025.108653. Epub 2025 Feb 15.
3. Predicting Colorectal Cancer Survival Using Time-to-Event Machine Learning: Retrospective Cohort Study.
J Med Internet Res. 2023 Oct 26;25:e44417. doi: 10.2196/44417.
4. Deep survival analysis for interpretable time-varying prediction of preeclampsia risk.
J Biomed Inform. 2024 Aug;156:104688. doi: 10.1016/j.jbi.2024.104688. Epub 2024 Jul 11.
5. Survival prediction models: an introduction to discrete-time modeling.
BMC Med Res Methodol. 2022 Jul 26;22(1):207. doi: 10.1186/s12874-022-01679-6.
6. Development and internal validation of machine learning models for personalized survival predictions in spinal cord glioma patients.
Spine J. 2024 Jun;24(6):1065-1076. doi: 10.1016/j.spinee.2024.02.002. Epub 2024 Feb 15.
7. Dementia risk prediction in individuals with mild cognitive impairment: a comparison of Cox regression and machine learning models.
BMC Med Res Methodol. 2022 Nov 2;22(1):284. doi: 10.1186/s12874-022-01754-y.
8. Data-driven survival modeling for breast cancer prognostics: A comparative study with machine learning and traditional survival modeling methods.
PLoS One. 2025 Apr 22;20(4):e0318167. doi: 10.1371/journal.pone.0318167. eCollection 2025.
9. Construction of a random survival forest model based on a machine learning algorithm to predict early recurrence after hepatectomy for adult hepatocellular carcinoma.
BMC Cancer. 2024 Dec 25;24(1):1575. doi: 10.1186/s12885-024-13366-4.
10. Interpretable lung cancer risk prediction using ensemble learning and XAI based on lifestyle and demographic data.
Comput Biol Chem. 2025 Aug;117:108438. doi: 10.1016/j.compbiolchem.2025.108438. Epub 2025 Mar 27.

Cited by

1. Machine Learning in Myasthenia Gravis: A Systematic Review of Prognostic Models and AI-Assisted Clinical Assessments.
Diagnostics (Basel). 2025 Aug 14;15(16):2044. doi: 10.3390/diagnostics15162044.
2. Machine learning for predicting all-cause mortality of metabolic dysfunction-associated fatty liver disease: a longitudinal study based on NHANES.
BMC Gastroenterol. 2025 May 15;25(1):376. doi: 10.1186/s12876-025-03946-4.