
Penalization and shrinkage methods produced unreliable clinical prediction models especially when sample size was small.

Author information

Centre for Prognosis Research, School of Medicine, Keele University, Staffordshire, UK, ST5 5BG.

Publication information

J Clin Epidemiol. 2021 Apr;132:88-96. doi: 10.1016/j.jclinepi.2020.12.005. Epub 2020 Dec 8.

Abstract

OBJECTIVES

When developing a clinical prediction model, penalization techniques are recommended to address overfitting, as they shrink predictor effect estimates toward the null and reduce mean-square prediction error in new individuals. However, shrinkage and penalty terms ('tuning parameters') are estimated with uncertainty from the development data set. We examined the magnitude of this uncertainty and the subsequent impact on prediction model performance.
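The shrinkage toward the null that penalization applies can be seen directly in the closed-form ridge solution. The sketch below is not from the paper; it uses simulated data and illustrative parameter values to show the ridge coefficient vector contracting as the penalty grows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a small development data set: 30 individuals, 5 predictors.
n, p = 30, 5
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 0.5, 0.0, -0.5, 0.25])
y = X @ beta_true + rng.normal(size=n)

def ridge(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Larger penalties shrink the coefficient vector toward zero;
# lam = 0 recovers ordinary least squares.
for lam in [0.0, 1.0, 10.0, 100.0]:
    b = ridge(X, y, lam)
    print(f"lambda={lam:6.1f}  ||beta|| = {np.linalg.norm(b):.3f}")
```

The norm of the coefficient vector decreases monotonically in the penalty, which is the mechanism by which ridge trades a little bias for lower mean-square prediction error.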

STUDY DESIGN AND SETTING

This study comprises applied examples and a simulation study of the following methods: uniform shrinkage (estimated via a closed-form solution or bootstrapping), ridge regression, the lasso, and elastic net.
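As an illustration of the bootstrap route to a uniform shrinkage factor, the sketch below (simulated data; the calibration-slope heuristic shown is a common implementation, not code from the study) refits the model on bootstrap resamples and averages the calibration slope of each refit's predictions on the original data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated development data: 40 individuals, 6 weak predictors.
n, p = 40, 6
X = rng.normal(size=(n, p))
beta_true = rng.normal(scale=0.3, size=p)
y = X @ beta_true + rng.normal(size=n)

def ols(X, y):
    # Ordinary least squares via the normal equations.
    return np.linalg.solve(X.T @ X, X.T @ y)

def bootstrap_shrinkage(X, y, n_boot=200):
    """Uniform shrinkage factor: the average calibration slope of
    bootstrap-model predictions evaluated on the original data."""
    n = len(y)
    slopes = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # bootstrap resample
        b = ols(X[idx], y[idx])            # refit on the resample
        lp = X @ b                         # linear predictor on original data
        # Calibration slope: regress y on the linear predictor (+ intercept).
        A = np.column_stack([np.ones(n), lp])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        slopes.append(coef[1])
    return float(np.mean(slopes))

s_hat = bootstrap_shrinkage(X, y)
beta_shrunk = s_hat * ols(X, y)  # apply the uniform shrinkage
print(round(s_hat, 3))
```

A factor below 1 indicates overfitting; multiplying all coefficients by it is the "uniform shrinkage" route, in contrast to ridge or the lasso, which shrink each coefficient through a penalty term.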

RESULTS

In a particular model development data set, penalization methods can be unreliable because tuning parameters are estimated with large uncertainty. This is of most concern when development data sets have a small effective sample size and the model's Cox-Snell R² is low. The problem can lead to considerable miscalibration of model predictions in new individuals.
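The instability of tuning-parameter estimation can be reproduced in a few lines. This sketch (simulated data, illustrative settings only) selects the ridge penalty by leave-one-out cross-validation on repeated small development samples drawn from the same population, and records how widely the chosen value spreads:

```python
import numpy as np

rng = np.random.default_rng(2)

def ridge_fit(X, y, lam):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def loo_lambda(X, y, grid):
    """Choose the ridge penalty by leave-one-out cross-validation."""
    n = len(y)
    errs = []
    for lam in grid:
        sq = 0.0
        for i in range(n):
            mask = np.arange(n) != i
            b = ridge_fit(X[mask], y[mask], lam)
            sq += (y[i] - X[i] @ b) ** 2
        errs.append(sq)
    return grid[int(np.argmin(errs))]

grid = np.logspace(-2, 3, 12)
n, p = 25, 5                      # deliberately small development samples
beta_true = np.array([0.4, -0.3, 0.2, 0.0, 0.1])

chosen = []
for _ in range(20):               # 20 development data sets, same population
    X = rng.normal(size=(n, p))
    y = X @ beta_true + rng.normal(size=n)
    chosen.append(loo_lambda(X, y, grid))

# Spread of the selected penalty across the repeated samples.
print(min(chosen), max(chosen))
```

Each data set would be a perfectly ordinary development sample on its own, yet the selected penalty (and hence the fitted model) can differ substantially from sample to sample, which is the source of the miscalibration described above.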

CONCLUSION

Penalization methods are not a 'carte blanche'; they do not guarantee a reliable prediction model is developed. They are more unreliable when needed most (i.e., when overfitting may be large). We recommend they are best applied with large effective sample sizes, as identified from recent sample size calculations that aim to minimize the potential for model overfitting and precisely estimate key parameters.

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5407/8026952/cf766dc90e33/gr1.jpg
