Demographic bias of expert-level vision-language foundation models in medical imaging.

Author information

Yang Yuzhe, Liu Yujia, Liu Xin, Gulhane Avanti, Mastrodicasa Domenico, Wu Wei, Wang Edward J, Sahani Dushyant, Patel Shwetak

Affiliations

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA.

Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA, USA.

Publication information

Sci Adv. 2025 Mar 28;11(13):eadq0305. doi: 10.1126/sciadv.adq0305. Epub 2025 Mar 26.

Abstract

Advances in artificial intelligence (AI) have achieved expert-level performance in medical imaging applications. Notably, self-supervised vision-language foundation models can detect a broad spectrum of pathologies without relying on explicit training annotations. However, it is crucial to ensure that these AI models do not mirror or amplify human biases, disadvantaging historically marginalized groups such as females or Black patients. In this study, we investigate the algorithmic fairness of state-of-the-art vision-language foundation models in chest x-ray diagnosis across five globally sourced datasets. Our findings reveal that compared to board-certified radiologists, these foundation models consistently underdiagnose marginalized groups, with even higher rates seen in intersectional subgroups such as Black female patients. Such biases are present across a wide range of pathologies and demographic attributes. Further analysis of the model embeddings uncovers their substantial encoding of demographic information. Deploying medical AI systems with biases can intensify preexisting care disparities, posing potential challenges to equitable healthcare access and raising ethical questions about their clinical applications.
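The abstract's central metric, the underdiagnosis rate, can be made concrete with a small sketch: the fraction of truly positive patients the model labels as having no finding, computed per demographic subgroup and per intersectional subgroup. The sketch below is a minimal illustration under stated assumptions, not the authors' actual pipeline; the column names (y_true, y_pred, sex, race) and the toy data are hypothetical stand-ins.

```python
import pandas as pd

def underdiagnosis_rate(g: pd.DataFrame) -> float:
    """Fraction of truly positive cases predicted as 'no finding' (0)."""
    positives = g[g["y_true"] == 1]          # patients with the pathology
    if positives.empty:
        return float("nan")
    return float((positives["y_pred"] == 0).mean())

# Hypothetical per-patient predictions for a single pathology.
df = pd.DataFrame({
    "y_true": [1, 1, 1, 1, 0, 1, 1, 0],
    "y_pred": [0, 0, 1, 1, 0, 0, 1, 1],
    "sex":  ["F", "F", "M", "F", "M", "F", "M", "F"],
    "race": ["Black", "Black", "White", "White",
             "Black", "Black", "White", "White"],
})

# Marginal subgroups (sex or race alone), then the intersectional
# subgroups (race x sex) highlighted in the abstract.
for keys in (["sex"], ["race"], ["race", "sex"]):
    rates = df.groupby(keys)[["y_true", "y_pred"]].apply(underdiagnosis_rate)
    print(rates.to_string(), end="\n\n")

# A simple fairness gap: the spread of underdiagnosis rates across
# intersectional subgroups.
gap = df.groupby(["race", "sex"])[["y_true", "y_pred"]].apply(underdiagnosis_rate)
print("intersectional gap:", gap.max() - gap.min())
```

The embedding analysis can likewise be sketched as a linear probe: train a simple classifier to predict a demographic attribute from frozen image embeddings, where accuracy well above chance indicates that the embeddings encode demographic information. The arrays below are random placeholders (so this probe should score near chance); real foundation-model embeddings are what the abstract reports as substantially encoding demographics.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 512))  # stand-in for frozen image features
attribute = rng.integers(0, 2, size=1000)  # stand-in for a binary attribute

X_tr, X_te, y_tr, y_te = train_test_split(
    embeddings, attribute, test_size=0.25, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Accuracy well above chance (0.5 here) would indicate that the
# embeddings encode the demographic attribute.
print("probe accuracy:", round(probe.score(X_te, y_te), 3))
```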
