Outcome class imbalance and rare events: An underappreciated complication for overdose risk prediction modeling.

Author affiliations

Department of Epidemiology, Brown University School of Public Health, Providence, Rhode Island, USA.

Department of Emergency Medicine, Alpert Medical School of Brown University, Providence, Rhode Island, USA.

Publication information

Addiction. 2023 Jun;118(6):1167-1176. doi: 10.1111/add.16133. Epub 2023 Feb 6.

DOI: 10.1111/add.16133

PMID: 36683137

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10175167/

Abstract

BACKGROUND AND AIMS

Low outcome prevalence, often observed with opioid-related outcomes, poses an underappreciated challenge to accurate predictive modeling. Outcome class imbalance, where non-events (i.e. negative class observations) outnumber events (i.e. positive class observations) by a moderate to extreme degree, can distort measures of predictive accuracy in misleading ways, and make the overall predictive accuracy and the discriminatory ability of a predictive model appear spuriously high. We conducted a simulation study to measure the impact of outcome class imbalance on predictive performance of a simple SuperLearner ensemble model and suggest strategies for reducing that impact.
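
As a minimal illustration of why overall accuracy can look spuriously high under imbalance, consider a hypothetical "classifier" that labels every observation a non-event. This sketch is not the study's model; the function name and the choice of ratios (taken from the design described in the abstract) are assumptions made for illustration.

```python
def majority_class_accuracy(ratio):
    """Accuracy of a rule that labels every observation a non-event,
    when non-events outnumber events by ratio:1."""
    # All non-events are classified correctly, all events incorrectly.
    return ratio / (ratio + 1)


# Non-event:event ratios matching the four simulated scenarios.
ratios = [1, 10, 100, 1000]
baseline = {r: majority_class_accuracy(r) for r in ratios}

for r, acc in baseline.items():
    # Sensitivity is always 0 because no event is ever flagged.
    print(f"{r}:1 -> accuracy {acc:.4f}, sensitivity 0.0")
```

Such a rule never identifies a single event, yet its accuracy approaches 1 as the ratio grows, which is why accuracy alone is a poor summary for rare outcomes.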

DESIGN, SETTING, PARTICIPANTS

Using a Monte Carlo design with 250 repetitions, we trained and evaluated these models on four simulated data sets with 100 000 observations each: one with perfect balance between events and non-events, and three where non-events outnumbered events by an approximate factor of 10:1, 100:1, and 1000:1, respectively.
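
The four scenarios can be sketched roughly as follows. The abstract does not describe the data-generating process, so everything here is an assumption for illustration: a single Gaussian predictor whose mean shifts by a hypothetical `effect` for events, a hypothetical `simulate` helper, and one draw per scenario rather than 250 repetitions. No SuperLearner model is fitted.

```python
import random


def simulate(n, ratio, seed, effect=1.0):
    """Draw n (x, y) pairs where non-events outnumber events by about ratio:1.

    Assumed data-generating process (not the paper's): y is Bernoulli with
    prevalence 1 / (1 + ratio), and x is normal with mean effect * y.
    """
    rng = random.Random(seed)
    prevalence = 1.0 / (1.0 + ratio)
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < prevalence else 0
        x = rng.gauss(effect * y, 1.0)
        rows.append((x, y))
    return rows


# One data set of 100 000 observations per imbalance scenario.
datasets = {r: simulate(100_000, r, seed=r) for r in (1, 10, 100, 1000)}
event_counts = {r: sum(y for _, y in rows) for r, rows in datasets.items()}

for r, count in event_counts.items():
    print(f"{r}:1 -> {count} events out of {len(datasets[r])}")
```

At the most extreme ratio only around a hundred events are expected among 100 000 observations, which gives a sense of how little signal a model has to learn from.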

MEASUREMENTS

We evaluated the performance of these models using a comprehensive suite of measures, including measures that are more appropriate for imbalanced data.
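
The abstract does not list the individual measures. As a hedged sketch, the following computes several metrics commonly recommended for imbalanced data from a confusion matrix; the choice of metrics, the helper name `imbalance_metrics`, and the example counts are all assumptions, not the study's suite.

```python
import math


def imbalance_metrics(tp, fp, fn, tn):
    """Return a dict of classification metrics from confusion-matrix counts."""

    def safe_div(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    sensitivity = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": safe_div(tp, tp + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts: 100 events among 10 000 observations, strict threshold.
metrics = imbalance_metrics(tp=10, fp=5, fn=90, tn=9895)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this made-up example accuracy exceeds 0.99 while sensitivity is 0.10, a gap that balanced accuracy, F1, and the Matthews correlation coefficient make visible.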

FINDINGS

Increasing imbalance tended to spuriously improve overall accuracy: using a high threshold to classify events versus non-events, overall accuracy rose from 0.45 with perfect balance to 0.99 with the most severe outcome class imbalance. Other metrics, however, revealed diminished predictive performance: the corresponding positive predictive value fell from 0.99 to 0.14.
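
The opposite movement of accuracy and positive predictive value follows from Bayes' rule when sensitivity and specificity are held fixed and only prevalence changes. The sketch below uses hypothetical operating characteristics (sensitivity 0.10, specificity 0.999, mimicking a high classification threshold); these are assumed values and do not reproduce the study's reported results.

```python
def accuracy_and_ppv(sensitivity, specificity, ratio):
    """Accuracy and PPV when non-events outnumber events by ratio:1."""
    prevalence = 1.0 / (1.0 + ratio)
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    accuracy = true_pos + true_neg
    ppv = true_pos / (true_pos + false_pos)
    return accuracy, ppv


# Assumed (hypothetical) operating point for a strict threshold.
SENS, SPEC = 0.10, 0.999
results = {r: accuracy_and_ppv(SENS, SPEC, r) for r in (1, 10, 100, 1000)}

for r, (acc, ppv) in results.items():
    print(f"{r}:1 -> accuracy {acc:.3f}, PPV {ppv:.3f}")
```

With these assumed inputs, accuracy climbs from about 0.55 at perfect balance to about 0.998 at 1000:1, while PPV drops from about 0.99 to about 0.09, the same qualitative pattern the findings describe.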

CONCLUSION

Increasing reliance on algorithmic risk scores in consequential decision-making processes raises critical fairness and ethical concerns. This paper provides broad guidance for analytic strategies that clinical investigators can use to remedy the impacts of outcome class imbalance on risk prediction tools.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f265/10175167/dd494beaff68/nihms-1868769-f0001.jpg
