

Comparison of Methods to Reduce Bias From Clinical Prediction Models of Postpartum Depression.

Author Affiliations

Center for Computational Health, IBM Research, Cambridge, Massachusetts.

Center for Computational Health, IBM TJ Watson Research Center, Yorktown Heights, NY.

Publication Information

JAMA Netw Open. 2021 Apr 1;4(4):e213909. doi: 10.1001/jamanetworkopen.2021.3909.

DOI: 10.1001/jamanetworkopen.2021.3909
PMID: 33856478
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8050742/
Abstract

IMPORTANCE

The lack of standards in methods to reduce bias for clinical algorithms presents various challenges in providing reliable predictions and in addressing health disparities.

OBJECTIVE

To evaluate approaches for reducing bias in machine learning models using a real-world clinical scenario.

DESIGN, SETTING, AND PARTICIPANTS

Health data for this cohort study were obtained from the IBM MarketScan Medicaid Database. Eligibility criteria were as follows: (1) female individuals aged 12 to 55 years with a live birth record identified by delivery-related codes from January 1, 2014, through December 31, 2018; (2) greater than 80% enrollment through pregnancy to 60 days post partum; and (3) evidence of coverage for depression screening and mental health services. Statistical analysis was performed in 2020.

EXPOSURES

Binarized race (Black individuals and White individuals).

MAIN OUTCOMES AND MEASURES

Machine learning models (logistic regression [LR], random forest, and extreme gradient boosting) were trained for 2 binary outcomes: postpartum depression (PPD) and postpartum mental health service utilization. Risk-adjusted generalized linear models were used for each outcome to assess potential disparity in the cohort associated with binarized race (Black or White). Methods for reducing bias, including reweighing, Prejudice Remover, and removing race from the models, were examined by analyzing changes in fairness metrics compared with the base models. Baseline characteristics of female individuals at the top-predicted risk decile were compared for systematic differences. Fairness was measured with disparate impact (DI; 1 indicates fairness) and equal opportunity difference (EOD; 0 indicates fairness).
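Both fairness metrics used here are simple functions of group-level prediction rates: DI is the ratio of positive-prediction rates between the unprivileged and privileged groups, and EOD is the difference in true-positive rates. A minimal sketch in Python (the function names, variable names, and the 0/1 privileged-group encoding are illustrative assumptions, not from the paper):

```python
import numpy as np

def disparate_impact(y_pred, group):
    """DI = P(y_pred=1 | unprivileged) / P(y_pred=1 | privileged).
    group: 1 = privileged, 0 = unprivileged. A value of 1 indicates fairness."""
    p_unpriv = y_pred[group == 0].mean()
    p_priv = y_pred[group == 1].mean()
    return p_unpriv / p_priv

def equal_opportunity_difference(y_true, y_pred, group):
    """EOD = TPR(unprivileged) - TPR(privileged). A value of 0 indicates fairness."""
    def tpr(g):
        # True-positive rate: positive predictions among actual positives in group g.
        mask = (group == g) & (y_true == 1)
        return y_pred[mask].mean()
    return tpr(0) - tpr(1)
```

In this study's reading, DI = 0.31 for the base LR model means the unprivileged group received positive predictions at less than a third of the privileged group's rate.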

RESULTS

Among 573 634 female individuals initially examined for this study, 314 903 were White (54.9%), 217 899 were Black (38.0%), and the mean (SD) age was 26.1 (5.5) years. The risk-adjusted odds ratio comparing White participants with Black participants was 2.06 (95% CI, 2.02-2.10) for clinically recognized PPD and 1.37 (95% CI, 1.33-1.40) for postpartum mental health service utilization. Taking the LR model for PPD prediction as an example, reweighing reduced bias as measured by improved DI and EOD metrics from 0.31 and -0.19 to 0.79 and 0.02, respectively. Removing race from the models had inferior performance for reducing bias compared with the other methods (PPD: DI = 0.61; EOD = -0.05; mental health service utilization: DI = 0.63; EOD = -0.04).
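Reweighing, the preprocessing method that improved DI and EOD most here, assigns each training example a weight so that the protected attribute and the outcome label become statistically independent in the weighted data before model fitting. A sketch assuming the standard Kamiran-Calders formulation (the function and variable names are illustrative, not from the paper's code):

```python
import numpy as np

def reweighing_weights(group, y):
    """Kamiran-Calders reweighing: each sample in cell (group=g, y=v) gets weight
    P(group=g) * P(y=v) / P(group=g, y=v), which makes group and label
    independent under the weighted empirical distribution."""
    w = np.ones(len(y), dtype=float)
    for g in np.unique(group):
        for v in np.unique(y):
            mask = (group == g) & (y == v)
            if mask.any():
                p_expected = (group == g).mean() * (y == v).mean()
                p_observed = mask.mean()
                w[mask] = p_expected / p_observed
    return w
```

The resulting weights can then be passed to any learner that accepts per-sample weights, for example scikit-learn's `LogisticRegression().fit(X, y, sample_weight=w)`, leaving the model class itself unchanged.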

CONCLUSIONS AND RELEVANCE

Clinical prediction models trained on potentially biased data may produce unfair outcomes on the basis of the chosen metrics. This study's results suggest that the performance varied depending on the model, outcome label, and method for reducing bias. This approach toward evaluating algorithmic bias can be used as an example for the growing number of researchers who wish to examine and address bias in their data and models.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/526c/8050742/e61e8262a157/jamanetwopen-e213909-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/526c/8050742/a628538b8d51/jamanetwopen-e213909-g002.jpg

Similar Articles

1. Comparison of Methods to Reduce Bias From Clinical Prediction Models of Postpartum Depression. JAMA Netw Open. 2021 Apr 1;4(4):e213909. doi: 10.1001/jamanetworkopen.2021.3909.
2. Evaluating and mitigating bias in machine learning models for cardiovascular disease prediction. J Biomed Inform. 2023 Feb;138:104294. doi: 10.1016/j.jbi.2023.104294. Epub 2023 Jan 24.
3. Fairness in Predicting Cancer Mortality Across Racial Subgroups. JAMA Netw Open. 2024 Jul 1;7(7):e2421290. doi: 10.1001/jamanetworkopen.2024.21290.
4. Algorithmic Fairness of Machine Learning Models for Alzheimer Disease Progression. JAMA Netw Open. 2023 Nov 1;6(11):e2342203. doi: 10.1001/jamanetworkopen.2023.42203.
5. Preparing for the bedside-optimizing a postpartum depression risk prediction model for clinical implementation in a health system. J Am Med Inform Assoc. 2024 May 20;31(6):1258-1267. doi: 10.1093/jamia/ocae056.
6. Neighborhood Disadvantage, Race and Ethnicity, and Postpartum Depression. JAMA Netw Open. 2023 Nov 1;6(11):e2342398. doi: 10.1001/jamanetworkopen.2023.42398.
7. Association of Primary Intracerebral Hemorrhage With Pregnancy and the Postpartum Period. JAMA Netw Open. 2020 Apr 1;3(4):e202769. doi: 10.1001/jamanetworkopen.2020.2769.
8. Racial Differences in Postpartum Blood Pressure Trajectories Among Women After a Hypertensive Disorder of Pregnancy. JAMA Netw Open. 2020 Dec 1;3(12):e2030815. doi: 10.1001/jamanetworkopen.2020.30815.
9. Postpartum Mental Health and Breastfeeding Practices: An Analysis Using the 2010-2011 Pregnancy Risk Assessment Monitoring System. Matern Child Health J. 2017 Mar;21(3):636-647. doi: 10.1007/s10995-016-2150-6.
10. Evaluating Algorithmic Bias in 30-Day Hospital Readmission Models: Retrospective Analysis. J Med Internet Res. 2024 Apr 18;26:e47125. doi: 10.2196/47125.

Cited By

1. Explainable mortality prediction models incorporating social health determinants and physical frailty for heart failure patients. PLoS One. 2025 Sep 3;20(9):e0327979. doi: 10.1371/journal.pone.0327979. eCollection 2025.
2. Empirical Comparison of Post-processing Debiasing Methods for Machine Learning Classifiers in Healthcare. J Healthc Inform Res. 2025 Mar 20;9(3):465-493. doi: 10.1007/s41666-025-00196-7. eCollection 2025 Sep.
3. Operationalization of Artificial Intelligence Applications in the Intensive Care Unit: A Systematic Review. JAMA Netw Open. 2025 Jul 1;8(7):e2522866. doi: 10.1001/jamanetworkopen.2025.22866.
4. Clinical Algorithms and the Legacy of Race-Based Correction: Historical Errors, Contemporary Revisions and Equity-Oriented Methodologies for Epidemiologists. Clin Epidemiol. 2025 Jul 12;17:647-662. doi: 10.2147/CLEP.S527000. eCollection 2025.
5. AI-driven multimodal colorimetric analytics for biomedical and behavioral health diagnostics. Comput Struct Biotechnol J. 2025 May 28;27:2219-2232. doi: 10.1016/j.csbj.2025.05.015. eCollection 2025.
6. A scoping review and evidence gap analysis of clinical AI fairness. NPJ Digit Med. 2025 Jun 14;8(1):360. doi: 10.1038/s41746-025-01667-2.
7. Disparate Model Performance and Stability in Machine Learning Clinical Support for Diabetes and Heart Diseases. AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:95-104. eCollection 2025.
8. Assessing Algorithmic Fairness With a Multimodal Artificial Intelligence Model in Men of African and Non-African Origin on NRG Oncology Prostate Cancer Phase III Trials. JCO Clin Cancer Inform. 2025 May;9:e2400284. doi: 10.1200/CCI-24-00284. Epub 2025 May 9.
9. The imperative of diversity and equity for the adoption of responsible AI in healthcare. Front Artif Intell. 2025 Apr 16;8:1577529. doi: 10.3389/frai.2025.1577529. eCollection 2025.
10. Development and validation of a novel risk assessment model for accurate prediction of intraoperative hypothermia in adult patients undergoing different types of surgery: insights from a multicentre, retrospective cohort study. Ann Med. 2025 Dec;57(1):2489749. doi: 10.1080/07853890.2025.2489749. Epub 2025 Apr 12.

References

1. Ethical limitations of algorithmic fairness solutions in health care machine learning. Lancet Digit Health. 2020 May;2(5):e221-e223. doi: 10.1016/S2589-7500(20)30065-0.
2. Hidden in Plain Sight - Reconsidering the Use of Race Correction in Clinical Algorithms. N Engl J Med. 2020 Aug 27;383(9):874-882. doi: 10.1056/NEJMms2004740. Epub 2020 Jun 17.
3. Racial disparities in automated speech recognition. Proc Natl Acad Sci U S A. 2020 Apr 7;117(14):7684-7689. doi: 10.1073/pnas.1915768117. Epub 2020 Mar 23.
4. The long road to fairer algorithms. Nature. 2020 Feb;578(7793):34-36. doi: 10.1038/d41586-020-00274-3.
5. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019 Oct 25;366(6464):447-453. doi: 10.1126/science.aax2342.
6. Ensuring Fairness in Machine Learning to Advance Health Equity. Ann Intern Med. 2018 Dec 18;169(12):866-872. doi: 10.7326/M18-1990. Epub 2018 Dec 4.
7. Potential Biases in Machine Learning Algorithms Using Electronic Health Record Data. JAMA Intern Med. 2018 Nov 1;178(11):1544-1547. doi: 10.1001/jamainternmed.2018.3763.
8. Good intentions are not enough: how informatics interventions can worsen inequality. J Am Med Inform Assoc. 2018 Aug 1;25(8):1080-1088. doi: 10.1093/jamia/ocy052.
9. On the causal interpretation of race in regressions adjusting for confounding and mediating variables. Epidemiology. 2014 Jul;25(4):473-84. doi: 10.1097/EDE.0000000000000105.
10. Rates and predictors of postpartum depression by race and ethnicity: results from the 2004 to 2007 New York City PRAMS survey (Pregnancy Risk Assessment Monitoring System). Matern Child Health J. 2013 Nov;17(9):1599-610. doi: 10.1007/s10995-012-1171-z.