Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
Palliative and Advanced Illness Research (PAIR) Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.
BMJ Qual Saf. 2023 Sep;32(9):503-516. doi: 10.1136/bmjqs-2022-015173. Epub 2023 Mar 31.
Evaluate the predictive performance of an electronic health record (EHR)-based, inpatient 6-month mortality risk model, developed to trigger palliative care consultation, among patient groups stratified by age, race, ethnicity, insurance and socioeconomic status (SES). Performance may vary across these groups because of social forces (eg, racism) that shape health, healthcare and health data.
Retrospective evaluation of prediction model.
Three urban hospitals within a single health system.
All patients ≥18 years admitted between 1 January and 31 December 2017, excluding observation, obstetric, rehabilitation and hospice (n=58 464 encounters, 41 327 patients).
General performance metrics (c-statistic, integrated calibration index (ICI), Brier Score) and additional measures relevant to health equity (accuracy, false positive rate (FPR), false negative rate (FNR)).
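As a rough illustration of the threshold-based measures listed above, the following Python sketch computes accuracy, FPR, FNR and the Brier score for two patient groups and then takes their differences. The function name `group_metrics`, the 0.5 classification threshold and the toy data are all assumptions made for illustration; the abstract does not state the threshold or the implementation used in the study.

```python
from dataclasses import dataclass


@dataclass
class GroupMetrics:
    accuracy: float
    fpr: float  # false positive rate: FP / (FP + TN)
    fnr: float  # false negative rate: FN / (FN + TP)
    brier: float  # mean squared error of the predicted probabilities


def group_metrics(y_true, y_prob, threshold=0.5):
    """Compute metrics for one group.

    y_true: 0/1 outcomes (1 = died within 6 months).
    y_prob: predicted mortality probabilities.
    threshold: hypothetical cut-off for flagging a patient (assumption).
    """
    tp = fp = tn = fn = 0
    squared_error = 0.0
    for y, p in zip(y_true, y_prob):
        flagged = p >= threshold
        squared_error += (p - y) ** 2
        if y == 1 and flagged:
            tp += 1
        elif y == 1:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    return GroupMetrics(
        accuracy=(tp + tn) / n,
        fpr=fp / (fp + tn) if (fp + tn) else float("nan"),
        fnr=fn / (fn + tp) if (fn + tp) else float("nan"),
        brier=squared_error / n,
    )


# Toy data, not taken from the study.
group_a = group_metrics([1, 1, 0, 0], [0.9, 0.2, 0.1, 0.6])
group_b = group_metrics([1, 0, 0, 0], [0.8, 0.3, 0.2, 0.1])

# Between-group differences (group A minus group B).
diff_accuracy = group_a.accuracy - group_b.accuracy
diff_fpr = group_a.fpr - group_b.fpr
diff_fnr = group_a.fnr - group_b.fnr
```

A higher FNR in one group means that more patients in that group who died within 6 months were not flagged, which is the measure most directly tied to missed palliative care consultations.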
For black versus non-Hispanic white patients, the model's accuracy was higher (difference 0.051, 95% CI 0.044 to 0.059), FPR lower (difference -0.060, 95% CI -0.067 to -0.052) and FNR higher (difference 0.049, 95% CI 0.023 to 0.078). A similar pattern was observed among patients who were Hispanic, younger, with Medicaid/missing insurance, or living in low SES zip codes. No consistent differences emerged in c-statistic, ICI or Brier Score. Younger age had the second-largest effect size in the mortality prediction model, and there were large standardised group differences in age (eg, 0.32 for non-Hispanic white versus black patients), suggesting age may contribute to systematic differences in the predicted probabilities between groups.
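The standardised group difference in age can be illustrated with a short sketch. It assumes the common definition of a standardised mean difference (difference in group means divided by the square root of the average of the two sample variances); the abstract does not specify which formula the authors used, and the function name and ages below are hypothetical.

```python
import math
import statistics


def standardised_difference(group_1, group_2):
    """Standardised mean difference between two groups.

    Assumed definition: (mean_1 - mean_2) / sqrt((var_1 + var_2) / 2),
    using sample variances. This is one common convention, not
    necessarily the one used in the study.
    """
    mean_1 = statistics.mean(group_1)
    mean_2 = statistics.mean(group_2)
    var_1 = statistics.variance(group_1)
    var_2 = statistics.variance(group_2)
    pooled_sd = math.sqrt((var_1 + var_2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Toy ages, not taken from the study.
ages_group_1 = [70, 80, 60, 90]
ages_group_2 = [50, 60, 40, 70]
smd = standardised_difference(ages_group_1, ages_group_2)
```

Because the measure is expressed in units of standard deviation, it does not depend on sample size, which makes it convenient for comparing covariate imbalance across groups of very different sizes.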
An EHR-based mortality risk model was less likely to identify some marginalised patients as potentially benefiting from palliative care, with younger age pinpointed as a possible mechanism. Evaluating predictive performance is a critical preliminary step in addressing algorithmic inequities in healthcare, which must also include evaluating clinical impact, and governance and regulatory structures for oversight, monitoring and accountability.