
Investigating for bias in healthcare algorithms: a sex-stratified analysis of supervised machine learning models in liver disease prediction.

Affiliation

Institute of Health Informatics, University College London, London, UK.

Publication information

BMJ Health Care Inform. 2022 Apr;29(1). doi: 10.1136/bmjhci-2021-100457.

Abstract

OBJECTIVES

The Indian Liver Patient Dataset (ILPD) is used extensively to create algorithms that predict liver disease. Given the existing research describing demographic inequities in liver disease diagnosis and management, these algorithms require scrutiny for potential biases. We address this overlooked issue by investigating ILPD models for sex bias.

METHODS

Following a literature review of ILPD papers, we recreate the models reported in existing studies and then interrogate them for bias. We define four experiments: training on sex-unbalanced/balanced data, each with and without feature selection. We build random forest (RF), support vector machine (SVM), Gaussian Naïve Bayes and logistic regression (LR) classifiers, run each experiment 100 times, and report mean results with SD.
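
The repeated-training protocol described above could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the synthetic data, feature count, model settings and number of repeats are stand-ins, and the sex-balancing and feature-selection steps of the four experiments are omitted for brevity.

```python
# Sketch of a repeated train/evaluate protocol with mean accuracy and SD,
# using synthetic stand-in data (NOT the ILPD).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 8))          # stand-in clinical features
sex = rng.integers(0, 2, size=n)     # 0 = female, 1 = male (illustrative coding)
y = (X[:, 0] + 0.5 * sex + rng.normal(size=n) > 0).astype(int)

def run_once(model, X, y, seed):
    """Train on a fresh random split and return held-out accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    model.fit(X_tr, y_tr)
    return accuracy_score(y_te, model.predict(X_te))

# The paper repeats each experiment 100 times; 10 repeats here keep the sketch fast.
models = {"RF": RandomForestClassifier(n_estimators=50),
          "LR": LogisticRegression(max_iter=1000)}
results = {name: [run_once(m, np.column_stack([X, sex]), y, s)
                  for s in range(10)]
           for name, m in models.items()}
for name, accs in results.items():
    print(f"{name}: {np.mean(accs):.3f} ({np.std(accs):.3f} SD)")
```

Reporting the SD across repeated random splits, as the paper does, distinguishes genuine performance differences between classifiers from run-to-run noise.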

RESULTS

We reproduce published models achieving accuracies of >70% (LR 71.31% (2.37 SD) to SVM 79.40% (2.50 SD)) and demonstrate a previously unobserved performance disparity: across all classifiers, females suffer a higher false negative rate (FNR). RF and LR classifiers are currently reported as the most effective models, yet in our experiments they demonstrate the greatest FNR disparity (RF: -21.02%; LR: -24.07%).
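
The sex-stratified FNR comparison underlying these results can be sketched as below. The labels, predictions and sex coding are illustrative placeholders, not ILPD outputs; the disparity sign convention (male minus female) is an assumption chosen to match the negative values reported above.

```python
# Minimal sketch of a sex-stratified false-negative-rate comparison
# on illustrative data (NOT the paper's predictions).
import numpy as np

def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP): the share of truly positive cases missed."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    positives = y_true == 1
    if positives.sum() == 0:
        return float("nan")
    return float(((y_pred == 0) & positives).sum() / positives.sum())

# Illustrative case: the model misses more diseased females than males.
y_true = np.array([1, 1, 1, 1, 1, 1, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 1, 1, 1])
sex    = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 = female, 1 = male

fnr_f = false_negative_rate(y_true[sex == 0], y_pred[sex == 0])
fnr_m = false_negative_rate(y_true[sex == 1], y_pred[sex == 1])
print(f"female FNR {fnr_f:.2f}, male FNR {fnr_m:.2f}, "
      f"disparity {fnr_m - fnr_f:+.2f}")
# → female FNR 0.50, male FNR 0.00, disparity -0.50
```

Stratifying the error-rate calculation by a protected attribute in this way is what reveals a disparity that a single aggregate accuracy figure, such as those reported for the published models, conceals.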

DISCUSSION

We demonstrate a sex disparity in published ILPD classifiers. In practice, the higher FNR for females would manifest as increased rates of missed diagnosis for female patients and a consequent lack of appropriate care. Our study demonstrates that evaluating biases in the initial stages of machine learning can provide insights into inequalities in current clinical practice, reveal pathophysiological differences between males and females, and mitigate the digitisation of inequalities into algorithmic systems.

CONCLUSION

Our findings are important to medical data scientists, clinicians and policy-makers involved in the implementation of medical artificial intelligence systems. An awareness of the potential biases of these systems is essential in preventing the digital exacerbation of healthcare inequalities.

