The relative data hungriness of unpenalized and penalized logistic regression and ensemble-based machine learning methods: the case of calibration.

Authors

Austin Peter C, Lee Douglas S, Wang Bo

Affiliations

ICES, V106, 2075 Bayview Avenue, Toronto, ON, M4N 3M5, Canada.

Department of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada.

Publication

Diagn Progn Res. 2024 Nov 5;8(1):15. doi: 10.1186/s41512-024-00179-z.

DOI:10.1186/s41512-024-00179-z
PMID:39501360
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11539735/
Abstract

BACKGROUND

Machine learning methods are increasingly being used to predict clinical outcomes. Optimism is the difference in model performance between derivation and validation samples. The term "data hungriness" refers to the sample size needed for a modelling technique to generate a prediction model with minimal optimism. Our objective was to compare the relative data hungriness of different statistical and machine learning methods when assessed using calibration.
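The two definitions above can be made concrete with a minimal sketch. It assumes a sign convention (derivation minus validation) and a tolerance for "minimal optimism", neither of which the abstract specifies; the function names and the toy numbers are hypothetical placeholders, not results from the study.

```python
def optimism(derivation_performance, validation_performance):
    """Optimism as the difference in performance between the derivation
    and validation samples (sign convention is an assumption)."""
    return derivation_performance - validation_performance


def smallest_adequate_size(optimism_by_size, tolerance):
    """Smallest sample size from which |optimism| stays within `tolerance`
    for every larger size examined; None if even the largest size fails.

    `optimism_by_size` maps sample size -> measured optimism.
    `tolerance` is a hypothetical threshold for "minimal optimism".
    """
    adequate = None
    # Walk from the largest size downwards and stop at the first failure.
    for size in sorted(optimism_by_size, reverse=True):
        if abs(optimism_by_size[size]) <= tolerance:
            adequate = size
        else:
            break
    return adequate


# Made-up placeholder values, purely to exercise the functions.
toy = {500: 0.12, 1000: 0.06, 2000: 0.03, 4000: 0.015, 8000: 0.01}
print(smallest_adequate_size(toy, tolerance=0.02))
```

Under this reading, a method is more data hungry than another when its smallest adequate size is larger for the same tolerance.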

METHODS

We used Monte Carlo simulations to assess the effect of number of events per variable (EPV) on the optimism of six learning methods when assessing model calibration: unpenalized logistic regression, ridge regression, lasso regression, bagged classification trees, random forests, and stochastic gradient boosting machines using trees as the base learners. We performed simulations in two large cardiovascular datasets each of which comprised an independent derivation and validation sample: patients hospitalized with acute myocardial infarction and patients hospitalized with heart failure. We used six data-generating processes, each based on one of the six learning methods. We allowed the sample sizes to be such that the number of EPV ranged from 10 to 200 in increments of 10. We applied six prediction methods in each of the simulated derivation samples and evaluated calibration in the simulated validation samples using the integrated calibration index, the calibration intercept, and the calibration slope. We also examined Nagelkerke's R², the scaled Brier score, and the c-statistic.
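As a rough illustration of two quantities named in the methods, the sketch below converts an EPV target into a sample size and computes the calibration intercept and slope by logistic recalibration on the logit of the predicted probability. This is a generic textbook formulation, not the authors' code: the helper names, the predictor count, the event rate, and the simulated data are all assumptions, and the integrated calibration index is omitted because it needs a smoother.

```python
import math
import random


def sample_size_for_epv(epv, n_predictors, event_rate):
    """Sample size whose expected number of events is epv * n_predictors."""
    return math.ceil(epv * n_predictors / event_rate)


def logit(p):
    return math.log(p / (1.0 - p))


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def calibration_intercept(y, p, max_iter=50, tol=1e-10):
    """Intercept a in logit(P(y=1)) = a + logit(p), slope fixed at 1."""
    lp = [logit(pi) for pi in p]  # predictions must lie strictly in (0, 1)
    a = 0.0
    for _ in range(max_iter):
        mu = [expit(a + li) for li in lp]
        score = sum(yi - mi for yi, mi in zip(y, mu))
        info = sum(mi * (1.0 - mi) for mi in mu)
        step = score / info  # one-dimensional Newton-Raphson update
        a += step
        if abs(step) < tol:
            break
    return a


def calibration_slope(y, p, max_iter=50, tol=1e-10):
    """Slope b in logit(P(y=1)) = a + b * logit(p), via Newton-Raphson."""
    lp = [logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(max_iter):
        mu = [expit(a + b * li) for li in lp]
        w = [mi * (1.0 - mi) for mi in mu]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * li for yi, mi, li in zip(y, mu, lp))
        h00 = sum(w)
        h01 = sum(wi * li for wi, li in zip(w, lp))
        h11 = sum(wi * li * li for wi, li in zip(w, lp))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det  # solve the 2x2 Newton system
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return b


# EPV grid from the abstract; 20 predictors and a 12.5% event rate are assumed.
sizes = [sample_size_for_epv(e, n_predictors=20, event_rate=0.125)
         for e in range(10, 201, 10)]

# Simulated validation data (hypothetical): outcomes follow the true risk.
rng = random.Random(0)
true_lp = [rng.gauss(0.0, 1.0) for _ in range(20000)]
y = [1 if rng.random() < expit(v) else 0 for v in true_lp]
well_calibrated = [expit(v) for v in true_lp]
overfit = [expit(2.0 * v) for v in true_lp]  # predictions too extreme

print(calibration_intercept(y, well_calibrated), calibration_slope(y, well_calibrated))
print(calibration_slope(y, overfit))
```

A slope near 1 and an intercept near 0 indicate good calibration; predictions that are too extreme, as overfitted models tend to produce, give a slope below 1.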

RESULTS

Across all 12 scenarios (2 diseases × 6 data-generating processes), penalized logistic regression displayed very low optimism even when the number of EPV was very low. Random forests and bagged trees tended to be the most data hungry and displayed the greatest optimism.

CONCLUSIONS

When assessed using calibration, penalized logistic regression was substantially less data hungry than methods from the machine learning literature.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/aada641dd27a/41512_2024_179_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/889b714f4951/41512_2024_179_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/420956338fa8/41512_2024_179_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/df89e99ccdb1/41512_2024_179_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/cc19fb2f1e02/41512_2024_179_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/2c7eea85002a/41512_2024_179_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/d9cc11c467d4/41512_2024_179_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/14bd0623d7c5/41512_2024_179_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/4349a0052a80/41512_2024_179_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/950f072a4043/41512_2024_179_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/ffb56ce8e835/41512_2024_179_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/41cb/11539735/930a57b7558d/41512_2024_179_Fig12_HTML.jpg

Similar articles

1
The relative data hungriness of unpenalized and penalized logistic regression and ensemble-based machine learning methods: the case of calibration.
Diagn Progn Res. 2024 Nov 5;8(1):15. doi: 10.1186/s41512-024-00179-z.
2
Predictive performance of machine and statistical learning methods: Impact of data-generating processes on external validity in the "large N, small p" setting.
Stat Methods Med Res. 2021 Jun;30(6):1465-1483. doi: 10.1177/09622802211002867. Epub 2021 Apr 13.
3
Empirical analyses and simulations showed that different machine and statistical learning methods had differing performance for predicting blood pressure.
Sci Rep. 2022 Jun 3;12(1):9312. doi: 10.1038/s41598-022-13015-5.
4
Dementia risk prediction in individuals with mild cognitive impairment: a comparison of Cox regression and machine learning models.
BMC Med Res Methodol. 2022 Nov 2;22(1):284. doi: 10.1186/s12874-022-01754-y.
5
Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints.
BMC Med Res Methodol. 2014 Dec 22;14:137. doi: 10.1186/1471-2288-14-137.
6
Classification of imbalanced data using machine learning algorithms to predict the risk of renal graft failures in Ethiopia.
BMC Med Inform Decis Mak. 2023 May 22;23(1):98. doi: 10.1186/s12911-023-02185-5.
7
Does the SORG Algorithm Predict 5-year Survival in Patients with Chondrosarcoma? An External Validation.
Clin Orthop Relat Res. 2019 Oct;477(10):2296-2303. doi: 10.1097/CORR.0000000000000748.
8
Evaluating the performance of machine learning methods and variable selection methods for predicting difficult-to-measure traits in Holstein dairy cattle using milk infrared spectral data.
J Dairy Sci. 2021 Jul;104(7):8107-8121. doi: 10.3168/jds.2020-19861. Epub 2021 Apr 15.
9
10
Machine learning-based risk prediction of malignant arrhythmia in hospitalized patients with heart failure.
ESC Heart Fail. 2021 Dec;8(6):5363-5371. doi: 10.1002/ehf2.13627. Epub 2021 Sep 28.

Cited by

1
Developing a Predictive Model for Significant Prostate Cancer Detection in Prostatic Biopsies from Seven Clinical Variables: Is Machine Learning Superior to Logistic Regression?
Cancers (Basel). 2025 Mar 25;17(7):1101. doi: 10.3390/cancers17071101.