

Evaluating Algorithmic Bias in 30-Day Hospital Readmission Models: Retrospective Analysis.

Affiliations

Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States.

Johns Hopkins Center for Population Health Information Technology, Baltimore, MD, United States.

Publication Information

J Med Internet Res. 2024 Apr 18;26:e47125. doi: 10.2196/47125.

DOI: 10.2196/47125
PMID: 38422347
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11066744/
Abstract

BACKGROUND

The adoption of predictive algorithms in health care comes with the potential for algorithmic bias, which could exacerbate existing disparities. Fairness metrics have been proposed to measure algorithmic bias, but their application to real-world tasks is limited.

OBJECTIVE

This study aims to evaluate the algorithmic bias associated with the application of common 30-day hospital readmission models and assess the usefulness and interpretability of selected fairness metrics.

METHODS

We used 10.6 million adult inpatient discharges from Maryland and Florida from 2016 to 2019 in this retrospective study. Models predicting 30-day hospital readmissions were evaluated: LACE Index, modified HOSPITAL score, and modified Centers for Medicare & Medicaid Services (CMS) readmission measure, which were applied as-is (using existing coefficients) and retrained (recalibrated with 50% of the data). Predictive performances and bias measures were evaluated for all, between Black and White populations, and between low- and other-income groups. Bias measures included the parity of false negative rate (FNR), false positive rate (FPR), 0-1 loss, and generalized entropy index. Racial bias represented by FNR and FPR differences was stratified to explore shifts in algorithmic bias in different populations.
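The parity measures named above (FNR, FPR, 0-1 loss differences, generalized entropy index) are standard group- and individual-level fairness quantities. As a minimal sketch only — not the authors' implementation — the following assumes binary labels/predictions and uses Speicher et al.'s per-individual benefit formulation b_i = ŷ_i − y_i + 1 with α = 2 for the generalized entropy index; the helper names and the boolean `group` mask are illustrative assumptions.

```python
import numpy as np

def rate_parity(y_true, y_pred, group):
    """FNR and FPR differences between one subpopulation (group=True)
    and its complement. Illustrative helper, not the study's code."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))

    def fnr_fpr(t, p):
        fnr = np.mean(p[t == 1] == 0)  # missed readmissions
        fpr = np.mean(p[t == 0] == 1)  # false alarms
        return fnr, fpr

    fnr_a, fpr_a = fnr_fpr(y_true[group], y_pred[group])
    fnr_b, fpr_b = fnr_fpr(y_true[~group], y_pred[~group])
    return fnr_a - fnr_b, fpr_a - fpr_b

def generalized_entropy_index(y_true, y_pred, alpha=2):
    """Generalized entropy index over per-individual benefits
    b_i = yhat_i - y_i + 1 (assumed formulation; alpha=2 is common)."""
    b = np.asarray(y_pred) - np.asarray(y_true) + 1
    mu, n = b.mean(), b.size
    return np.sum((b / mu) ** alpha - 1) / (n * alpha * (alpha - 1))
```

A zero rate difference means the two groups are treated identically on that error type; the entropy index is zero only when every individual receives the same benefit.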

RESULTS

The retrained CMS model demonstrated the best predictive performance (area under the curve: 0.74 in Maryland and 0.68-0.70 in Florida), and the modified HOSPITAL score demonstrated the best calibration (Brier score: 0.16-0.19 in Maryland and 0.19-0.21 in Florida). Calibration was better in White (compared to Black) populations and other-income (compared to low-income) groups, and the area under the curve was higher or similar in the Black (compared to White) populations. The retrained CMS and modified HOSPITAL score had the lowest racial and income bias in Maryland. In Florida, both of these models overall had the lowest income bias and the modified HOSPITAL score showed the lowest racial bias. In both states, the White and higher-income populations showed a higher FNR, while the Black and low-income populations resulted in a higher FPR and a higher 0-1 loss. When stratified by hospital and population composition, these models demonstrated heterogeneous algorithmic bias in different contexts and populations.
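The two performance metrics reported here can be computed directly from predicted probabilities. The sketch below (not the study's code) uses the rank-sum (Mann-Whitney) identity for the area under the ROC curve and the mean-squared-error definition of the Brier score; function names are illustrative.

```python
import numpy as np

def brier_score(y_true, p_hat):
    """Mean squared gap between predicted probability and outcome;
    lower is better (the paper reports roughly 0.16-0.21)."""
    y_true, p_hat = np.asarray(y_true, float), np.asarray(p_hat, float)
    return np.mean((p_hat - y_true) ** 2)

def auc(y_true, p_hat):
    """Area under the ROC curve: the probability that a random positive
    is scored above a random negative, counting ties as one half."""
    y_true, p_hat = np.asarray(y_true), np.asarray(p_hat, float)
    pos, neg = p_hat[y_true == 1], p_hat[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

Note the two metrics answer different questions: the AUC measures discrimination (ranking), while the Brier score is also sensitive to calibration, which is why a model can lead on one and trail on the other, as observed here.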

CONCLUSIONS

Caution must be taken when interpreting fairness measures at face value. A higher FNR or FPR could potentially reflect missed opportunities or wasted resources, but these measures could also reflect health care use patterns and gaps in care. Simply relying on the statistical notions of bias could obscure or underplay the causes of health disparity. The imperfect health data, analytic frameworks, and the underlying health systems must be carefully considered. Fairness measures can serve as a useful routine assessment to detect disparate model performances but are insufficient to inform mechanisms or policy changes. However, such an assessment is an important first step toward data-driven improvement to address existing health disparities.

Figures 1-5 (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a5b6/11066744/bfbaa084aca2/jmir_v26i1e47125_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a5b6/11066744/671b71741124/jmir_v26i1e47125_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a5b6/11066744/30146c610f8f/jmir_v26i1e47125_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a5b6/11066744/d30c1a8d483f/jmir_v26i1e47125_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a5b6/11066744/c8f6e0e2c162/jmir_v26i1e47125_fig5.jpg

Similar Articles

1
Evaluating Algorithmic Bias in 30-Day Hospital Readmission Models: Retrospective Analysis.
J Med Internet Res. 2024 Apr 18;26:e47125. doi: 10.2196/47125.
2
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.
3
[Volume and health outcomes: evidence from systematic reviews and from evaluation of Italian hospital data].
Epidemiol Prev. 2013 Mar-Jun;37(2-3 Suppl 2):1-100.
4
Are Current Survival Prediction Tools Useful When Treating Subsequent Skeletal-related Events From Bone Metastases?
Clin Orthop Relat Res. 2024 Sep 1;482(9):1710-1721. doi: 10.1097/CORR.0000000000003030. Epub 2024 Mar 22.
5
Does the Presence of Missing Data Affect the Performance of the SORG Machine-learning Algorithm for Patients With Spinal Metastasis? Development of an Internet Application Algorithm.
Clin Orthop Relat Res. 2024 Jan 1;482(1):143-157. doi: 10.1097/CORR.0000000000002706. Epub 2023 Jun 12.
6
Home treatment for mental health problems: a systematic review.
Health Technol Assess. 2001;5(15):1-139. doi: 10.3310/hta5150.
7
Cost-effectiveness of using prognostic information to select women with breast cancer for adjuvant systemic therapy.
Health Technol Assess. 2006 Sep;10(34):iii-iv, ix-xi, 1-204. doi: 10.3310/hta10340.
8
Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.
Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.
9
The Black Book of Psychotropic Dosing and Monitoring.
Psychopharmacol Bull. 2024 Jul 8;54(3):8-59.
10
AI-based Hepatic Steatosis Detection and Integrated Hepatic Assessment from Cardiac CT Attenuation Scans Enhances All-cause Mortality Risk Stratification: A Multi-center Study.
medRxiv. 2025 Jun 11:2025.06.09.25329157. doi: 10.1101/2025.06.09.25329157.

Cited By

1
The ethics of data mining in healthcare: challenges, frameworks, and future directions.
BioData Min. 2025 Jul 11;18(1):47. doi: 10.1186/s13040-025-00461-w.
2
Predicting spatio-temporal dynamics of dengue using INLA (integrated nested laplace approximation) in Yogyakarta, Indonesia.
BMC Public Health. 2025 Apr 8;25(1):1321. doi: 10.1186/s12889-025-22545-2.
3
Development and validation of a risk prediction model for 30-day readmission in elderly type 2 diabetes patients complicated with heart failure: a multicenter, retrospective study.
Front Endocrinol (Lausanne). 2025 Feb 27;16:1534516. doi: 10.3389/fendo.2025.1534516. eCollection 2025.

References

1
Algorithmic fairness in computational medicine.
EBioMedicine. 2022 Oct;84:104250. doi: 10.1016/j.ebiom.2022.104250. Epub 2022 Sep 6.
2
A bias evaluation checklist for predictive models and its pilot application for 30-day hospital readmission models.
J Am Med Inform Assoc. 2022 Jul 12;29(8):1323-1333. doi: 10.1093/jamia/ocac065.
3
Assessing socioeconomic bias in machine learning algorithms in health care: a case study of the HOUSES index.
J Am Med Inform Assoc. 2022 Jun 14;29(7):1142-1151. doi: 10.1093/jamia/ocac052.
4
Framework for Integrating Equity Into Machine Learning Models: A Case Study.
Chest. 2022 Jun;161(6):1621-1627. doi: 10.1016/j.chest.2022.02.001. Epub 2022 Feb 7.
5
In medicine, how do we machine learn anything real?
Patterns (N Y). 2022 Jan 14;3(1):100392. doi: 10.1016/j.patter.2021.100392.
6
The Promise for Reducing Healthcare Cost with Predictive Model: An Analysis with Quantized Evaluation Metric on Readmission.
J Healthc Eng. 2021 Nov 2;2021:9208138. doi: 10.1155/2021/9208138. eCollection 2021.
7
Ethical Machine Learning in Healthcare.
Annu Rev Biomed Data Sci. 2021 Jul;4:123-144. doi: 10.1146/annurev-biodatasci-092820-114757. Epub 2021 May 6.
8
Application of machine learning in predicting hospital readmissions: a scoping review of the literature.
BMC Med Res Methodol. 2021 May 6;21(1):96. doi: 10.1186/s12874-021-01284-z.
9
Equity in essence: a call for operationalising fairness in machine learning for healthcare.
BMJ Health Care Inform. 2021 Apr;28(1). doi: 10.1136/bmjhci-2020-100289.
10
Racial Disparities in the Use of Surgical Procedures in the US.
JAMA Surg. 2021 Mar 1;156(3):274-281. doi: 10.1001/jamasurg.2020.6257.