
Disparate Censorship & Undertesting: A Source of Label Bias in Clinical Machine Learning.

Author Information

Chang Trenton, Sjoding Michael W, Wiens Jenna

Affiliations

Division of Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA.

Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA.

Publication Information

Proc Mach Learn Res. 2022 Aug;182:343-390.

Abstract

As machine learning (ML) models gain traction in clinical applications, understanding the impact of clinician and societal biases on ML models is increasingly important. While biases can arise in the labels used for model training, the many sources from which these biases arise are not yet well-studied. In this paper, we highlight disparate censorship (i.e., differences in testing rates across patient groups) as a source of label bias that clinical ML models may amplify, potentially causing harm. Many patient risk-stratification models are trained using the results of clinician-ordered diagnostic and laboratory tests as labels. Patients without test results are often assigned a negative label, which assumes that untested patients do not experience the outcome. Since test orders are affected by clinical and resource considerations, testing may not be uniform across patient populations, giving rise to disparate censorship. Disparate censorship in patients of equivalent risk leads to undertesting in certain groups, and in turn, more biased labels for such groups. Using such biased labels in standard ML pipelines could contribute to gaps in model performance across patient groups. Here, we theoretically and empirically characterize conditions in which disparate censorship or undertesting affects model performance across subgroups. Our findings call attention to disparate censorship as a source of label bias in clinical ML models.
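
To make the mechanism concrete, the following is a minimal simulation sketch in Python (illustrative only, not code from the paper; the group sizes, risk model, and testing rates are all assumptions) showing how a negative-by-default labeling rule, combined with group-dependent testing rates, yields unequally biased labels for two groups of equal underlying risk.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # Two patient groups (A = 0, B = 1) with identical underlying risk.
    group = rng.integers(0, 2, size=n)
    x = rng.normal(size=n)                      # risk-related feature
    true_risk = 1 / (1 + np.exp(-x))            # same risk model for both groups
    y_true = rng.random(n) < true_risk          # true outcome (unobserved in practice)

    # Disparate censorship: group B is tested less often at the same risk level.
    test_rate = np.where(group == 0, 0.8, 0.3)  # assumed testing rates
    tested = rng.random(n) < test_rate

    # Negative-by-default: untested patients are labeled negative.
    y_obs = y_true & tested

    # False-negative rate induced purely by undertesting, per group.
    for g in (0, 1):
        mask = group == g
        fn_rate = np.mean(y_true[mask] & ~tested[mask])
        print(f"group {g}: labels biased negative for {fn_rate:.1%} of patients")

With these assumed rates, roughly 10% of group A and 35% of group B carry false-negative labels despite identical true risk; a standard ML pipeline trained on y_obs therefore understates risk for the undertested group, which is the performance gap the abstract describes.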


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0432/10162497/e8136285d355/nihms-1868579-f0013.jpg

Similar Articles

The future of Cochrane Neonatal.
Early Hum Dev. 2020 Nov;150:105191. doi: 10.1016/j.earlhumdev.2020.105191. Epub 2020 Sep 12.

Diagnostic biases in translational bioinformatics.
BMC Med Genomics. 2015 Aug 1;8:46. doi: 10.1186/s12920-015-0116-y.

Weakly Semi-supervised phenotyping using Electronic Health records.
J Biomed Inform. 2022 Oct;134:104175. doi: 10.1016/j.jbi.2022.104175. Epub 2022 Sep 5.

