
Subpopulation-specific machine learning prognosis for underrepresented patients with double prioritized bias correction.

Authors

Afrose Sharmin, Song Wenjia, Nemeroff Charles B, Lu Chang, Yao Danfeng Daphne

Affiliations

Department of Computer Science, Virginia Tech, Blacksburg, VA USA.

Department of Psychiatry and Behavioral Sciences, The University of Texas at Austin Dell Medical School, Austin, TX USA.

Publication information

Commun Med (Lond). 2022 Sep 1;2:111. doi: 10.1038/s43856-022-00165-w. eCollection 2022.


DOI:10.1038/s43856-022-00165-w
PMID:36059892
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9436942/
Abstract

BACKGROUND: Many clinical datasets are intrinsically imbalanced, dominated by overwhelming majority groups. Off-the-shelf machine learning models that optimize the prognosis of majority patient types (e.g., healthy class) may cause substantial errors on the minority prediction class (e.g., disease class) and demographic subgroups (e.g., Black or young patients). In the typical one-machine-learning-model-fits-all paradigm, racial and age disparities are likely to exist, but unreported. In addition, some widely used whole-population metrics give misleading results.

METHODS: We design a double prioritized (DP) bias correction technique to mitigate representational biases in machine learning-based prognosis. Our method trains customized machine learning models for specific ethnicity or age groups, a substantial departure from the one-model-predicts-all convention. We compare with other sampling and reweighting techniques in mortality and cancer survivability prediction tasks.

RESULTS: We first provide empirical evidence showing various prediction deficiencies in a typical machine learning setting without bias correction. For example, missed death cases are 3.14 times higher than missed survival cases for mortality prediction. Then, we show DP consistently boosts the minority class recall for underrepresented groups, by up to 38.0%. DP also reduces relative disparities across race and age groups, e.g., up to 88.0% better than the 8 existing sampling solutions in terms of the relative disparity of minority class recall. Cross-race and cross-age-group evaluation also suggests the need for subpopulation-specific machine learning models.

CONCLUSIONS: Biases exist in the widely accepted one-machine-learning-model-fits-all-population approach. We invent a bias correction method that produces specialized machine learning prognostication models for underrepresented racial and age groups. This technique may reduce potentially life-threatening prediction mistakes for minority populations.
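
The quantities named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation. The abstract does not define "relative disparity" or say how the "double prioritized" step is carried out, so both are assumptions here: relative disparity is taken as the ratio of the best to the worst subgroup recall, and the prioritization is taken as duplicating training samples that are both in the minority class and in the target subgroup. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def minority_recall_by_group(y_true, y_pred, groups, minority_label=1):
    """Recall on the minority class (e.g. death), computed per subgroup."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == minority_label:
            totals[group] += 1
            if pred == minority_label:
                hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


def relative_disparity(recalls):
    """ASSUMED definition: best subgroup recall / worst subgroup recall.

    A value of 1.0 means every subgroup has the same recall.
    """
    worst = min(recalls.values())
    best = max(recalls.values())
    return float("inf") if worst == 0 else best / worst


def duplicate_prioritized(rows, target_group, minority_label=1, copies=1):
    """ASSUMED reading of 'double prioritized': add extra copies of the
    training rows that belong to BOTH the target subgroup and the
    minority class, leaving all other rows unchanged."""
    extra = [
        row for row in rows
        if row["group"] == target_group and row["label"] == minority_label
    ]
    return list(rows) + extra * copies


# Toy data: label 1 is the minority class, "B" is the smaller subgroup.
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B"]

recalls = minority_recall_by_group(y_true, y_pred, groups)
print(recalls)                      # per-subgroup minority class recall
print(relative_disparity(recalls))  # ratio of best to worst recall

train_rows = [{"group": g, "label": t} for g, t in zip(groups, y_true)]
augmented = duplicate_prioritized(train_rows, target_group="B", copies=2)
print(len(train_rows), len(augmented))
```

Under these assumptions, a separate model would be trained on the augmented set for each target subgroup and then compared on that subgroup's own minority class recall, rather than on a whole-population metric.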

