Detecting, Characterizing, and Mitigating Implicit and Explicit Racial Biases in Health Care Datasets With Subgroup Learnability: Algorithm Development and Validation Study.

Author Information

Gulamali Faris, Sawant Ashwin Shreekant, Liharska Lora, Horowitz Carol, Chan Lili, Hofer Ira, Singh Karandeep, Richardson Lynne, Mensah Emmanuel, Charney Alexander, Reich David, Hu Jianying, Nadkarni Girish

Affiliations

Icahn School of Medicine at Mount Sinai, 1468 Madison Avenue, New York, NY 10029, United States. Phone: 1 212 241 6500.

University of California, San Diego, San Diego, CA, United States.

Publication Information

J Med Internet Res. 2025 Sep 4;27:e71757. doi: 10.2196/71757.

Abstract

BACKGROUND

The growing adoption of diagnostic and prognostic algorithms in health care has led to concerns about the perpetuation of algorithmic bias against disadvantaged groups. Deep learning methods to detect and mitigate bias have revolved around modifying models, optimization strategies, and threshold calibration, with varying levels of success and tradeoffs. However, there have been limited substantive efforts to address bias at the level of the data used to generate algorithms in health care datasets.

OBJECTIVE

The aim of this study is to create a simple metric (AEquity) that uses a learning curve approximation to distinguish and mitigate bias via guided dataset collection or relabeling.
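The abstract does not spell out how the learning curve approximation is computed. As a rough illustration, one common approach is to fit a power-law learning curve to each subgroup's validation error at increasing training-set sizes and compare the fitted asymptotes; a subgroup whose error plateaus high is one the dataset serves poorly. The sketch below takes that approach; the function names and sample numbers are illustrative assumptions, not the paper's definition of AEquity.

```python
# A minimal sketch, assuming subgroup learnability can be approximated by
# fitting a power-law learning curve per subgroup; the exact AEquity metric
# is defined in the full paper, not here.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    # Classic learning-curve form: error decays as a * n^(-b) toward an asymptote c.
    return a * np.power(n, -b) + c

def asymptotic_error(sample_sizes, errors):
    # Fit the curve to (training-set size, validation error) pairs for one
    # subgroup and return the fitted asymptote c: the error the model would
    # approach with unlimited data from this subgroup.
    (a, b, c), _ = curve_fit(power_law, sample_sizes, errors,
                             p0=(1.0, 0.5, 0.1),
                             bounds=([0, 0, 0], [np.inf, 5.0, 1.0]))
    return c

# Hypothetical validation errors measured at increasing training-set sizes.
sizes = np.array([100, 500, 1000, 5000, 10000])
err_group_a = np.array([0.40, 0.28, 0.22, 0.15, 0.13])
err_group_b = np.array([0.45, 0.38, 0.35, 0.32, 0.31])

# A large gap flags the subgroup that would benefit most from guided
# collection or relabeling of additional examples.
gap = asymptotic_error(sizes, err_group_b) - asymptotic_error(sizes, err_group_a)
print(f"learnability gap: {gap:.3f}")
```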

METHODS

We demonstrate this metric on 2 well-known examples, chest X-rays and health care cost utilization, and detect novel biases in the National Health and Nutrition Examination Survey.

RESULTS

We demonstrated that using AEquity to guide data-centric collection for each diagnostic finding in the chest radiograph dataset decreased bias by 29% to 96.5%, as measured by differences in area under the curve. Next, we examined (1) whether AEquity works on intersectional populations and (2) whether it is invariant to the choice of fairness metric, not just area under the curve. For Black patients on Medicaid, at the intersection of race and socioeconomic status, AEquity-based interventions reduced bias across a number of fairness metrics: overall false negative rate by 33.3% (absolute bias reduction 1.88×10⁻¹, 95% CI 1.4×10⁻¹ to 2.5×10⁻¹; relative reduction 33.3%, 95% CI 26.6%-40%); precision bias by 7.50×10⁻² (95% CI 7.48×10⁻² to 7.51×10⁻²; relative reduction 94.6%, 95% CI 94.5%-94.7%); and false discovery rate bias by 94.5% (absolute reduction 3.50×10⁻², 95% CI 3.49×10⁻² to 3.50×10⁻²). Similarly, AEquity-guided data collection reduced bias by up to 80% on mortality prediction with the National Health and Nutrition Examination Survey (absolute bias reduction 0.08, 95% CI 0.07-0.09). We then benchmarked AEquity against state-of-the-art data-guided debiasing measures, balanced empirical risk minimization and calibration, and showed that AEquity-guided data collection outperforms both. Moreover, we demonstrated that AEquity works on fully connected networks; convolutional neural networks such as ResNet-50; transformer architectures such as ViT-B/16, a vision transformer with 86 million parameters; and nonparametric methods such as the Light Gradient-Boosting Machine.
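To make the guided-collection idea concrete, below is a minimal sketch on synthetic data. One loud simplification: the collection signal here is the per-subgroup validation AUC, standing in for the AEquity learnability score, and every name and number (make_group, the pool sizes, the 200-example batches) is hypothetical rather than taken from the paper.

```python
# A minimal sketch of guided data collection on synthetic data, assuming the
# per-subgroup validation AUC as the guidance signal (a stand-in for AEquity).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_group(n, shift):
    # Synthetic binary task; larger `shift` makes the subgroup harder to fit.
    X = rng.normal(shift, 1.0, size=(n, 5))
    y = (X.sum(axis=1) + rng.normal(0.0, 2.0, n) > 5 * shift).astype(int)
    return X, y

# Candidate pools to collect from, plus a held-out validation set per subgroup.
pools = {g: make_group(5000, s) for g, s in [("A", 0.0), ("B", 0.5)]}
vals = {g: make_group(1000, s) for g, s in [("A", 0.0), ("B", 0.5)]}

X_tr, y_tr = np.empty((0, 5)), np.empty(0, dtype=int)
clf = None
for _ in range(10):  # collection rounds
    if clf is None:
        worst = list(pools)  # seed round: sample every subgroup once
    else:
        aucs = {g: roc_auc_score(vals[g][1], clf.predict_proba(vals[g][0])[:, 1])
                for g in pools}
        worst = [min(aucs, key=aucs.get)]  # direct the next batch at the worst-served subgroup
    for g in worst:
        Xg, yg = pools[g]
        # Draw a fresh batch from this subgroup's pool (duplicate draws across
        # rounds are possible; acceptable for a sketch).
        idx = rng.choice(len(yg), size=200, replace=False)
        X_tr = np.vstack([X_tr, Xg[idx]])
        y_tr = np.concatenate([y_tr, yg[idx]])
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

final = {g: roc_auc_score(vals[g][1], clf.predict_proba(vals[g][0])[:, 1]) for g in pools}
print("final per-subgroup AUC:", final)
```

On this toy setup, the loop spends most of its sampling budget on the harder subgroup, which is the mechanism the results above credit for narrowing subgroup performance gaps.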

CONCLUSIONS

In short, we demonstrated that AEquity is a robust tool by applying it to different datasets, algorithms, and intersectional analyses and measuring its effectiveness with respect to a range of traditional fairness metrics.

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a2f/12410029/0af9e8e68f87/jmir-v27-e71757-g001.jpg
