Penalization and shrinkage methods produced unreliable clinical prediction models especially when sample size was small.

Affiliation

Centre for Prognosis Research, School of Medicine, Keele University, Staffordshire, UK, ST5 5BG.

Publication information

J Clin Epidemiol. 2021 Apr;132:88-96. doi: 10.1016/j.jclinepi.2020.12.005. Epub 2020 Dec 8.

DOI:10.1016/j.jclinepi.2020.12.005
PMID:33307188
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8026952/
Abstract

OBJECTIVES

When developing a clinical prediction model, penalization techniques are recommended to address overfitting, as they shrink predictor effect estimates toward the null and reduce mean-square prediction error in new individuals. However, shrinkage and penalty terms ('tuning parameters') are estimated with uncertainty from the development data set. We examined the magnitude of this uncertainty and the subsequent impact on prediction model performance.

STUDY DESIGN AND SETTING

This study comprises applied examples and a simulation study of the following methods: uniform shrinkage (estimated via a closed-form solution or bootstrapping), ridge regression, the lasso, and elastic net.
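
For orientation, the methods named above can be written compactly. This is a sketch under stated assumptions, not the paper's exact notation. The abstract does not say which closed-form solution is used, so the heuristic shrinkage factor based on the likelihood ratio statistic is assumed here. The elastic net objective follows the common glmnet-style parametrization. The symbols $p$, $LR$, $\lambda$ and $\alpha$ are introduced for illustration only.

```latex
% Uniform shrinkage, heuristic closed form (assumed):
% p  = number of predictor parameters in the model
% LR = likelihood ratio statistic of the fitted model versus the intercept-only model
\hat{S} = 1 - \frac{p}{LR},
\qquad
\hat{\beta}_j^{\,\text{shrunk}} = \hat{S}\,\hat{\beta}_j

% Elastic net (glmnet-style parametrization, assumed):
% \ell(\beta) = log-likelihood, \lambda \ge 0 = penalty strength, \alpha \in [0, 1] = mixing weight
\hat{\beta} = \arg\min_{\beta}\;
  -\frac{1}{n}\,\ell(\beta)
  + \lambda\left[\frac{1-\alpha}{2}\sum_{j=1}^{p}\beta_j^{2}
  + \alpha\sum_{j=1}^{p}\lvert\beta_j\rvert\right]
```

Setting $\alpha = 0$ gives ridge regression and $\alpha = 1$ gives the lasso. In every case the tuning parameter ($\hat{S}$, or $\lambda$ and $\alpha$) is itself estimated from the development data, which is the source of uncertainty the study examines.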

RESULTS

In a particular model development data set, penalization methods can be unreliable because tuning parameters are estimated with large uncertainty. This is of most concern when development data sets have a small effective sample size and the model's Cox-Snell R² is low. The problem can lead to considerable miscalibration of model predictions in new individuals.
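
The instability described above can be illustrated with a small simulation. This is a minimal sketch and not the authors' simulation design. It assumes a continuous outcome fitted by ordinary least squares, five independent standard-normal predictors with a true coefficient of 0.2 each, and the heuristic shrinkage estimate S = 1 - p / LR, where LR = n * ln(SS_total / SS_residual) for a Gaussian linear model. The function names `solve_linear_system`, `heuristic_shrinkage` and `shrinkage_spread` are hypothetical.

```python
import math
import random
import statistics


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def heuristic_shrinkage(n, betas, rng):
    """Simulate one development data set and return S = 1 - p / LR."""
    p = len(betas)
    # Design matrix with an intercept column followed by p predictors.
    rows = [[1.0] + [rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(b * x for b, x in zip(betas, row[1:])) + rng.gauss(0.0, 1.0)
         for row in rows]
    k = p + 1
    # Normal equations (X'X) coef = X'y for the least-squares fit.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    y_bar = sum(y) / n
    ss_res = sum((yi - sum(c * x for c, x in zip(coef, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    lr = n * math.log(ss_tot / ss_res)  # likelihood ratio statistic
    return 1.0 - p / lr


def shrinkage_spread(n, reps=100, seed=1):
    """Mean and standard deviation of the estimated shrinkage factor
    across repeated development data sets of size n."""
    rng = random.Random(seed)
    betas = [0.2] * 5  # assumed true effects (weak signal)
    values = [heuristic_shrinkage(n, betas, rng) for _ in range(reps)]
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    for n in (50, 500):
        mean_s, sd_s = shrinkage_spread(n)
        print(f"n={n}: mean S = {mean_s:.3f}, SD of S = {sd_s:.3f}")
```

Under these assumptions the estimated shrinkage factor varies far more across small development data sets than across large ones, which is the pattern the abstract reports for tuning parameters in general.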

CONCLUSION

Penalization methods are not a 'carte blanche'; they do not guarantee a reliable prediction model is developed. They are more unreliable when needed most (i.e., when overfitting may be large). We recommend they are best applied with large effective sample sizes, as identified from recent sample size calculations that aim to minimize the potential for model overfitting and precisely estimate key parameters.
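
The abstract does not state which sample size calculation it refers to. One published criterion, assumed here for illustration, chooses the sample size so that the expected uniform shrinkage factor is at least a target value S (often 0.9): n = p / ((S - 1) * ln(1 - R²_CS / S)), where p is the number of candidate predictor parameters and R²_CS is the anticipated Cox-Snell R². The sketch below covers only this one criterion, and the function name `min_n_for_target_shrinkage` is hypothetical.

```python
import math


def min_n_for_target_shrinkage(n_params, r2_cs, target_shrinkage=0.9):
    """Smallest n for which the expected uniform shrinkage factor
    reaches target_shrinkage (assumed criterion, see lead-in).

    n_params         -- number of candidate predictor parameters (p)
    r2_cs            -- anticipated Cox-Snell R-squared of the model
    target_shrinkage -- desired expected shrinkage factor S, between 0 and 1
    """
    if not 0.0 < target_shrinkage < 1.0:
        raise ValueError("target_shrinkage must lie strictly between 0 and 1")
    if not 0.0 < r2_cs < target_shrinkage:
        raise ValueError("r2_cs must lie strictly between 0 and target_shrinkage")
    denominator = (target_shrinkage - 1.0) * math.log(1.0 - r2_cs / target_shrinkage)
    return math.ceil(n_params / denominator)


if __name__ == "__main__":
    # 10 candidate parameters, anticipated Cox-Snell R-squared of 0.1
    print(min_n_for_target_shrinkage(10, 0.1))  # prints 850
```

Halving the anticipated R²_CS roughly doubles the required sample size under this criterion, in line with the abstract's point that a low Cox-Snell R² is the most concerning setting.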

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5407/8026952/cf766dc90e33/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5407/8026952/ae98a9780e9e/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5407/8026952/24943b3e0857/gr3.jpg
