
What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models.

Author information

Babyak Michael A

Affiliations

Duke University Medical Center, Durham, NC 27710, USA.

Publication information

Psychosom Med. 2004 May-Jun;66(3):411-21. doi: 10.1097/01.psy.0000127692.23278.a9.

Abstract

Statistical models, such as linear or logistic regression or survival analysis, are frequently used as a means to answer scientific questions in psychosomatic research. Many who use these techniques, however, apparently fail to appreciate fully the problem of overfitting, ie, capitalizing on the idiosyncrasies of the sample at hand. Overfitted models will fail to replicate in future samples, thus creating considerable uncertainty about the scientific merit of the finding. The present article is a nontechnical discussion of the concept of overfitting and is intended to be accessible to readers with varying levels of statistical expertise. The notion of overfitting is presented in terms of asking too much from the available data. Given a certain number of observations in a data set, there is an upper limit to the complexity of the model that can be derived with any acceptable degree of uncertainty. Complexity arises as a function of the number of degrees of freedom expended (the number of predictors including complex terms such as interactions and nonlinear terms) against the same data set during any stage of the data analysis. Theoretical and empirical evidence--with a special focus on the results of computer simulation studies--is presented to demonstrate the practical consequences of overfitting with respect to scientific inference. Three common practices--automated variable selection, pretesting of candidate predictors, and dichotomization of continuous variables--are shown to pose a considerable risk for spurious findings in models. The dilemma between overfitting and exploring candidate confounders is also discussed. Alternative means of guarding against overfitting are discussed, including variable aggregation and the fixing of coefficients a priori. Techniques that account and correct for complexity, including shrinkage and penalization, also are introduced.
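The abstract's central warning, that asking too much of a small sample produces models that capitalize on chance, is easy to see in simulation. The sketch below (a minimal illustration, not the article's own simulation design) fits an ordinary least-squares regression with many candidate predictors to an outcome that is pure noise: the in-sample fit looks respectable, but the coefficients fail to generalize to a fresh sample drawn from the same process.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 20  # small sample, many candidate predictors

# Predictors and outcome are independent noise: there is nothing to find.
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# Fit OLS with all p predictors plus an intercept.
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
yhat = Xd @ beta
r2_in = 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

# Apply the same coefficients to a fresh sample from the same process.
X_new = rng.normal(size=(n, p))
y_new = rng.normal(size=n)
yhat_new = np.column_stack([np.ones(n), X_new]) @ beta
r2_out = 1 - np.sum((y_new - yhat_new) ** 2) / np.sum((y_new - y_new.mean()) ** 2)

print(f"in-sample R^2:     {r2_in:.2f}")   # inflated by overfitting
print(f"out-of-sample R^2: {r2_out:.2f}")  # near zero or negative
```

With 20 noise predictors and only 50 observations, the expected in-sample R² is roughly p/(n-1) ≈ 0.4 even though the true R² is zero, which is exactly the "what you see may not be what you get" problem; the out-of-sample fit collapses. Adding automated variable selection on top of this would make the surviving predictors look even more convincingly, and spuriously, significant.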

