

Demographic bias in misdiagnosis by computational pathology models.

Affiliations

Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.

Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.

Publication Information

Nat Med. 2024 Apr;30(4):1174-1190. doi: 10.1038/s41591-024-02885-z. Epub 2024 Apr 19.

Abstract

Despite increasing numbers of regulatory approvals, deep learning-based computational pathology systems often overlook the impact of demographic factors on performance, potentially leading to biases. This concern is all the more important as computational pathology has leveraged large public datasets that underrepresent certain demographic groups. Using publicly available data from The Cancer Genome Atlas and the EBRAINS brain tumor atlas, as well as internal patient data, we show that whole-slide image classification models display marked performance disparities across different demographic groups when used to subtype breast and lung carcinomas and to predict IDH1 mutations in gliomas. For example, when using common modeling approaches, we observed performance gaps (in area under the receiver operating characteristic curve) between white and Black patients of 3.0% for breast cancer subtyping, 10.9% for lung cancer subtyping and 16.0% for IDH1 mutation prediction in gliomas. We found that richer feature representations obtained from self-supervised vision foundation models reduce performance variations between groups. These representations provide improvements upon weaker models even when those weaker models are combined with state-of-the-art bias mitigation strategies and modeling choices. Nevertheless, self-supervised vision foundation models do not fully eliminate these discrepancies, highlighting the continuing need for bias mitigation efforts in computational pathology. Finally, we demonstrate that our results extend to other demographic factors beyond patient race. Given these findings, we encourage regulatory and policy agencies to integrate demographic-stratified evaluation into their assessment guidelines.
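The demographic-stratified evaluation the abstract advocates can be illustrated with a short sketch: compute AUROC separately for each demographic subgroup and report the largest gap between groups, the same metric behind the reported 3.0%, 10.9% and 16.0% disparities. The code below is a minimal illustration, not the authors' implementation; the variable names (y_true, y_score, race) and the synthetic data are hypothetical placeholders.

```python
# Minimal sketch of demographic-stratified AUROC evaluation.
# All data below is synthetic and illustrative only.
import numpy as np
from sklearn.metrics import roc_auc_score

def stratified_auroc(y_true, y_score, groups):
    """Return per-group AUROC and the max pairwise gap (in percentage points)."""
    y_true, y_score, groups = map(np.asarray, (y_true, y_score, groups))

    per_group = {}
    for g in np.unique(groups):
        mask = groups == g
        # AUROC is undefined if a subgroup contains only one class.
        if len(np.unique(y_true[mask])) < 2:
            continue
        per_group[g] = roc_auc_score(y_true[mask], y_score[mask])

    aucs = list(per_group.values())
    gap = (max(aucs) - min(aucs)) * 100 if len(aucs) >= 2 else float("nan")
    return per_group, gap

# Toy example with synthetic labels, scores and group assignments.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_score = np.clip(y_true * 0.6 + rng.normal(0.2, 0.3, size=200), 0, 1)
race = rng.choice(["white", "Black", "Asian"], size=200)

per_group, gap = stratified_auroc(y_true, y_score, race)
print(per_group, f"max AUROC gap: {gap:.1f} percentage points")
```

Reporting only the aggregate AUROC would hide exactly the kind of subgroup disparity the study documents, which is why the authors recommend that regulators require this stratified breakdown.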

