Smith Hayley, Sweeting Michael, Morris Tim, Crowther Michael J
Department of Health Sciences, University of Leicester, Leicester, LE1 7RH, UK.
Statistical Innovation, Oncology Biometrics, Oncology R&D, AstraZeneca, Cambridge, UK.
Diagn Progn Res. 2022 Jun 2;6(1):10. doi: 10.1186/s41512-022-00124-y.
There is substantial interest in the adaptation and application of so-called machine learning approaches to prognostic modelling of censored time-to-event data. These methods must be compared and evaluated against existing methods in a variety of scenarios to determine their predictive performance. A scoping review of how machine learning methods have been compared to traditional survival models is important to identify the comparisons that have been made, and to flag where those comparisons are lacking, biased towards one approach, or misleading.
We conducted a scoping review of research articles published between 1 January 2000 and 2 December 2020 using PubMed. Eligible articles were those that used simulation studies to compare statistical and machine learning methods for risk prediction with a time-to-event outcome in a medical/healthcare setting. We focused on the data-generating mechanisms (DGMs), the methods that were compared, the estimands of the simulation studies, and the performance measures used to evaluate them.
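For illustration only (the specific DGMs varied across the reviewed articles and are not reproduced here), a minimal survival-data DGM of the kind used in such simulation studies might generate event times from an exponential proportional-hazards model with independent censoring:

```python
import numpy as np

def simulate_ph_data(n, beta, baseline_rate=1.0, censor_scale=2.0, seed=0):
    """Illustrative DGM: exponential proportional-hazards event times
    with independent exponential censoring. Parameter names and values
    are hypothetical, not taken from any reviewed article."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(beta)))                    # covariates
    rate = baseline_rate * np.exp(X @ np.asarray(beta))    # subject-specific hazard
    event_time = rng.exponential(1.0 / rate)               # latent event times
    censor_time = rng.exponential(censor_scale, size=n)    # independent censoring times
    time = np.minimum(event_time, censor_time)             # observed follow-up time
    event = (event_time <= censor_time).astype(int)        # 1 = event observed, 0 = censored
    return X, time, event

X, time, event = simulate_ph_data(500, beta=[0.5, -0.3])
```

In a method comparison, data simulated from such a DGM would be fitted with both the statistical and the machine learning candidate models, and a chosen performance measure computed over many simulation repetitions.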
A total of ten articles were identified as eligible for the review. Six of the articles evaluated a method developed by the authors themselves (four of these were machine learning methods), and the results almost always stated that the developed method's performance was equivalent to or better than that of the other methods compared. Comparisons were often biased towards the novel approach, with the majority comparing only against a basic Cox proportional hazards model, and in scenarios where it was clear that model would not perform well. In many of the articles reviewed, key information was unclear, such as the number of simulation repetitions and how the performance measures were calculated.
It is vital that method comparisons are unbiased and comprehensive, and this should be the goal even if realising it is difficult. Fully assessing how newly developed methods perform and how they compare to a variety of traditional statistical methods for prognostic modelling is imperative as these methods are already being applied in clinical contexts. Evaluations of the performance and usefulness of recently developed methods for risk prediction should be continued and reporting standards improved as these methods become increasingly popular.