Section of Hospital Medicine, University of Chicago, Chicago, Illinois.
AgileMD, San Francisco, California.
JAMA Netw Open. 2024 Oct 1;7(10):e2438986. doi: 10.1001/jamanetworkopen.2024.38986.
IMPORTANCE: Early warning decision support tools to identify clinical deterioration in the hospital are widely used, but there is little information on their comparative performance.
OBJECTIVE: To compare 3 proprietary artificial intelligence (AI) early warning scores and 3 publicly available simple aggregated weighted scores.
DESIGN, SETTING, AND PARTICIPANTS: This retrospective cohort study was performed at 7 hospitals in the Yale New Haven Health System. All consecutive adult medical-surgical ward hospital encounters between March 9, 2019, and November 9, 2023, were included.
EXPOSURES: Simultaneous Epic Deterioration Index (EDI), Rothman Index (RI), eCARTv5 (eCART), Modified Early Warning Score (MEWS), National Early Warning Score (NEWS), and NEWS2 scores.
MAIN OUTCOMES AND MEASURES: Clinical deterioration, defined as a transfer from ward to intensive care unit or death within 24 hours of an observation.
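The outcome definition above can be sketched as a per-observation labeling rule. This is a minimal illustration only: the function name `label_observation`, the inclusive 24-hour window, and the choice to label observations recorded after the event as negative are assumptions made for the example, not details reported by the study.

```python
from datetime import datetime, timedelta

def label_observation(obs_time, event_time, horizon_hours=24):
    """Return 1 if a deterioration event (ICU transfer or death) occurs
    within `horizon_hours` after the observation, else 0.

    `event_time` is None for encounters with no event. Window boundaries
    are treated as inclusive (an assumption for this sketch).
    """
    if event_time is None:
        return 0
    delta = event_time - obs_time
    return int(timedelta(0) <= delta <= timedelta(hours=horizon_hours))

# Hypothetical timestamps, for illustration only.
event = datetime(2023, 1, 2, 12, 0)
early_obs = datetime(2023, 1, 1, 6, 0)   # 30 hours before the event
late_obs = datetime(2023, 1, 2, 2, 0)    # 10 hours before the event

print(label_observation(early_obs, event))  # outside the 24-hour window
print(label_observation(late_obs, event))   # inside the 24-hour window
print(label_observation(late_obs, None))    # encounter without an event
```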
RESULTS: Of the 362 926 patient encounters (median patient age, 64 [IQR, 47-77] years; 200 642 [55.3%] female), 16 693 (4.6%) experienced a clinical deterioration event. eCART had the highest area under the receiver operating characteristic curve at 0.895 (95% CI, 0.891-0.900), followed by NEWS2 at 0.831 (95% CI, 0.826-0.836), NEWS at 0.829 (95% CI, 0.824-0.835), RI at 0.828 (95% CI, 0.823-0.834), EDI at 0.808 (95% CI, 0.802-0.812), and MEWS at 0.757 (95% CI, 0.750-0.764). After matching scores at the moderate-risk sensitivity level for a NEWS score of 5, overall positive predictive values (PPVs) ranged from a low of 6.3% (95% CI, 6.1%-6.4%) for an EDI score of 41 to a high of 17.3% (95% CI, 16.9%-17.8%) for an eCART score of 94. Matching scores at the high-risk specificity of a NEWS score of 7 yielded overall PPVs ranging from a low of 14.5% (95% CI, 14.0%-15.2%) for an EDI score of 54 to a high of 23.3% (95% CI, 22.7%-24.2%) for an eCART score of 97. The moderate-risk thresholds provided a median of at least 20 hours of lead time for all the scores. Median lead time at the high-risk threshold was 11 (IQR, 0-69) hours for eCART, 8 (IQR, 0-63) hours for NEWS, 6 (IQR, 0-62) hours for NEWS2, 5 (IQR, 0-56) hours for MEWS, 1 (IQR, 0-39) hour for EDI, and 0 (IQR, 0-42) hours for RI.
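The idea of comparing PPVs after matching each score to the sensitivity of a NEWS cutoff can be illustrated with a minimal Python sketch. It assumes higher scores mean higher risk (scores that run the other way would need to be inverted first), picks the candidate cutoff whose sensitivity is closest to the reference, and uses made-up toy data. The function names and the nearest-match rule are assumptions for illustration, not the study's code or method.

```python
def sensitivity(scores, labels, threshold):
    """Fraction of positive observations flagged at score >= threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in positives) / len(positives)

def ppv(scores, labels, threshold):
    """Fraction of flagged observations (score >= threshold) that are positive."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return sum(flagged) / len(flagged) if flagged else float("nan")

def match_threshold_by_sensitivity(scores, labels, target_sensitivity):
    """Return the observed score value whose sensitivity is closest to the target."""
    candidates = sorted(set(scores))
    return min(
        candidates,
        key=lambda t: abs(sensitivity(scores, labels, t) - target_sensitivity),
    )

# Toy data (entirely synthetic): 1 = deterioration within the outcome window.
labels       = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
news_like    = [1, 2, 3, 5, 4, 2, 6, 7, 5, 8]            # reference score
other_score  = [10, 20, 30, 35, 60, 25, 40, 80, 70, 90]  # score to be matched

target = sensitivity(news_like, labels, 5)   # sensitivity of the reference cutoff
matched = match_threshold_by_sensitivity(other_score, labels, target)

print(target)                                # 0.75
print(matched)                               # 60
print(ppv(news_like, labels, 5))             # 0.6
print(ppv(other_score, labels, matched))     # 0.75
```

Holding sensitivity fixed means both scores catch the same share of deteriorating patients, so any difference in PPV reflects how many false alarms each score generates to do so.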
CONCLUSIONS AND RELEVANCE: In this cohort study of inpatient encounters, eCART outperformed the other AI and non-AI scores, identifying more deteriorating patients with fewer false alarms and sufficient time to intervene. NEWS, a non-AI, publicly available early warning score, significantly outperformed EDI. Given the wide variation in accuracy, additional transparency and oversight of early warning tools may be warranted.