The harm of class imbalance corrections for risk prediction models: illustration and simulation using logistic regression.

Affiliations

Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, The Netherlands.

Department of Development and Regeneration, KU Leuven, Leuven, Belgium.

Publication information

J Am Med Inform Assoc. 2022 Aug 16;29(9):1525-1534. doi: 10.1093/jamia/ocac093.

DOI:10.1093/jamia/ocac093

PMID:35686364

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9382395/

Abstract

OBJECTIVE

Methods to correct class imbalance (imbalance between the frequency of outcome events and nonevents) are receiving increasing interest for developing prediction models. We examined the effect of imbalance correction on the performance of logistic regression models.

MATERIAL AND METHODS

Prediction models were developed using standard and penalized (ridge) logistic regression under 4 methods to address class imbalance: no correction, random undersampling, random oversampling, and SMOTE. Model performance was evaluated in terms of discrimination, calibration, and classification. Using Monte Carlo simulations, we studied the impact of training set size, number of predictors, and the outcome event fraction. A case study on prediction modeling for ovarian cancer diagnosis is presented.
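The comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' simulation code: the data-generating model (intercept -3.0, slopes 0.8 and 0.5), the sample sizes, the seed, and all function names are assumptions chosen for the example, and SMOTE and ridge penalization are left out to keep the sketch short. It fits an unpenalized logistic regression with no correction, after random undersampling, and after random oversampling, then compares the mean predicted risk on a large test set with the observed event fraction.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Unpenalized logistic regression fitted by Newton-Raphson."""
    Xd = np.column_stack([np.ones(len(X)), X])       # prepend intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        grad = Xd.T @ (y - p)                         # score vector
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None]) # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_risk(beta, X):
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-(Xd @ beta)))

def simulate(n, rng, intercept=-3.0, slopes=(0.8, 0.5)):
    """Hypothetical data-generating model with a rare outcome."""
    X = rng.normal(size=(n, len(slopes)))
    risk = 1.0 / (1.0 + np.exp(-(intercept + X @ np.asarray(slopes))))
    y = rng.binomial(1, risk).astype(float)
    return X, y

def random_undersample(X, y, rng):
    """Keep all events and an equally sized random subset of nonevents."""
    events = np.flatnonzero(y == 1)
    nonevents = np.flatnonzero(y == 0)
    kept = rng.choice(nonevents, size=len(events), replace=False)
    idx = np.concatenate([events, kept])
    return X[idx], y[idx]

def random_oversample(X, y, rng):
    """Keep all records and duplicate random events until classes are balanced."""
    events = np.flatnonzero(y == 1)
    nonevents = np.flatnonzero(y == 0)
    extra = rng.choice(events, size=len(nonevents) - len(events), replace=True)
    idx = np.concatenate([nonevents, events, extra])
    return X[idx], y[idx]

rng = np.random.default_rng(2022)
X_train, y_train = simulate(5000, rng)
X_test, y_test = simulate(20000, rng)

samplers = {"none": None,
            "undersampling": random_undersample,
            "oversampling": random_oversample}
mean_risk = {}
for name, sampler in samplers.items():
    if sampler is None:
        Xs, ys = X_train, y_train
    else:
        Xs, ys = sampler(X_train, y_train, rng)
    beta = fit_logistic(Xs, ys)
    mean_risk[name] = predict_risk(beta, X_test).mean()

event_fraction = y_test.mean()
print("observed event fraction:", round(event_fraction, 3))
for name, value in mean_risk.items():
    print(name, "mean predicted risk:", round(value, 3))
```

A well-calibrated model should have a mean predicted risk close to the observed event fraction; under these assumptions only the uncorrected model does.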

RESULTS

The use of random undersampling, random oversampling, or SMOTE yielded poorly calibrated models: the probability of belonging to the minority class was strongly overestimated. These methods did not result in higher areas under the ROC curve when compared with models developed without correction for class imbalance. Although imbalance correction improved the balance between sensitivity and specificity, similar results were obtained by shifting the probability threshold instead.
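The threshold-shifting result can be illustrated with a small simulation. The sketch below rests on assumptions that are not taken from the paper: a hypothetical two-predictor data-generating model, a shifted threshold set equal to the training event fraction, and helper names invented for the example. It compares an uncorrected model classified at that shifted threshold with an undersampled model classified at 0.5, and also reports the area under the ROC curve for both.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Unpenalized logistic regression fitted by Newton-Raphson."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, Xd.T @ (y - p))
    return beta

def predict_risk(beta, X):
    Xd = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-(Xd @ beta)))

def simulate(n, rng, intercept=-3.0, slopes=(0.8, 0.5)):
    """Hypothetical data-generating model with a rare outcome."""
    X = rng.normal(size=(n, len(slopes)))
    risk = 1.0 / (1.0 + np.exp(-(intercept + X @ np.asarray(slopes))))
    return X, rng.binomial(1, risk).astype(float)

def sens_spec(risk, y, threshold):
    """Sensitivity and specificity when classifying risk >= threshold as event."""
    pred = risk >= threshold
    return pred[y == 1].mean(), (~pred)[y == 0].mean()

def auc(risk, y):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    ranks = np.empty(len(risk))
    ranks[np.argsort(risk)] = np.arange(1, len(risk) + 1)
    n1 = y.sum()
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

rng = np.random.default_rng(93)
X_train, y_train = simulate(10000, rng)
X_test, y_test = simulate(20000, rng)

# Model A: no imbalance correction.
risk_plain = predict_risk(fit_logistic(X_train, y_train), X_test)

# Model B: random undersampling of nonevents to a 1:1 ratio.
events = np.flatnonzero(y_train == 1)
nonevents = np.flatnonzero(y_train == 0)
kept = rng.choice(nonevents, size=len(events), replace=False)
idx = np.concatenate([events, kept])
risk_under = predict_risk(fit_logistic(X_train[idx], y_train[idx]), X_test)

prevalence = y_train.mean()                     # assumed choice of shifted threshold
default_plain = sens_spec(risk_plain, y_test, 0.5)
shifted_plain = sens_spec(risk_plain, y_test, prevalence)
default_under = sens_spec(risk_under, y_test, 0.5)
auc_plain, auc_under = auc(risk_plain, y_test), auc(risk_under, y_test)

print("uncorrected, threshold 0.5:       ", default_plain)
print("uncorrected, threshold prevalence:", shifted_plain)
print("undersampled, threshold 0.5:      ", default_under)
print("AUC uncorrected / undersampled:   ", auc_plain, auc_under)
```

In this setup the uncorrected model at the shifted threshold and the undersampled model at 0.5 classify almost identically, while only the uncorrected model keeps probability estimates on the original risk scale.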

DISCUSSION

Imbalance correction led to models with strong miscalibration without better ability to distinguish between patients with and without the outcome event. The inaccurate probability estimates reduce the clinical utility of the model, because decisions about treatment are ill-informed.
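A short derivation shows why resampling inflates the estimated risks while leaving discrimination largely unchanged. It is the standard case-control sampling argument, not a result quoted from the abstract, and it assumes a correctly specified logistic model and a sampling step that depends only on the outcome. The symbols $S$ (indicator that a record is retained), $r$ (retention probability for nonevents), and $\pi$ (event fraction) are introduced here for illustration. SMOTE creates synthetic records rather than resampling existing ones, so the argument applies to it only approximately.

```latex
\begin{align*}
\frac{P(Y=1 \mid x, S=1)}{P(Y=0 \mid x, S=1)}
  &= \frac{P(S=1 \mid Y=1)\, P(Y=1 \mid x)}{P(S=1 \mid Y=0)\, P(Y=0 \mid x)}
   = \frac{1}{r} \cdot \frac{P(Y=1 \mid x)}{P(Y=0 \mid x)}, \\[4pt]
\operatorname{logit} P(Y=1 \mid x, S=1)
  &= \operatorname{logit} P(Y=1 \mid x) + \log\frac{1}{r}, \\[4pt]
r &= \frac{\pi}{1-\pi}
  \quad\Longrightarrow\quad
  \log\frac{1}{r} = \log\frac{1-\pi}{\pi}
  \qquad \text{(undersampling to a 1:1 ratio)}.
\end{align*}
```

For example, with $\pi = 0.05$ the shift is $\log 19 \approx 2.94$ on the logit scale, so a patient whose true risk is 0.05 receives an estimated risk of about 0.5. Because the shift is the same constant for every patient, the ranking of patients, and therefore the area under the ROC curve, is essentially unaffected.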

CONCLUSION

Outcome imbalance is not a problem in itself; imbalance correction may even worsen model performance.
