Preserving Informative Presence: How Missing Data and Imputation Strategies Affect the Performance of an AI-Based Early Warning Score.

Author information

Sim Taeyong, Hahn Sangchul, Kim Kwang-Joon, Cho Eun-Young, Jeong Yeeun, Kim Ji-Hyun, Ha Eun-Yeong, Kim In-Cheol, Park Sun-Hyo, Cho Chi-Heum, Yu Gyeong-Im, Cho Hochan, Lee Ki-Byung

Affiliations

AITRICS Corp., Seoul 06221, Republic of Korea.

Division of Geriatrics, Department of Internal Medicine, Yonsei University College of Medicine, Seoul 03722, Republic of Korea.

Publication information

J Clin Med. 2025 Mar 24;14(7):2213. doi: 10.3390/jcm14072213.

Abstract

Data availability can affect the performance of AI-based early warning scores (EWSs). This study evaluated how the extent of missing data and the choice of imputation strategy influence the predictive performance of the VitalCare-Major Adverse Event Score (VC-MAES), an AI-based EWS that handles missing values with last observation carried forward and normal-value imputation and forecasts clinical deterioration events (unplanned ICU transfer, cardiac arrest, or death) up to 6 h in advance. We analyzed real-world data from 6039 patient encounters at Keimyung University Dongsan Hospital, Republic of Korea. Performance was evaluated under three scenarios: (1) using only vital signs and age, treating all other variables as missing; (2) reintroducing a full set of real-world clinical variables; and (3) imputing missing values either with values drawn from a distribution within one standard deviation of the observed mean or with Multiple Imputation by Chained Equations (MICE). VC-MAES achieved an area under the receiver operating characteristic curve (AUROC) of 0.896 using only vital signs and age, outperforming traditional EWSs, including the National Early Warning Score (0.797) and the Modified Early Warning Score (0.722). Reintroducing the full clinical variables improved the AUROC to 0.918, whereas mean-based imputation and MICE decreased it to 0.885 and 0.827, respectively. VC-MAES thus demonstrates robust predictive performance with limited inputs, and incorporating actual clinical data significantly improved accuracy. In contrast, mean-based and MICE imputation yielded poorer results than the default normal-value imputation, potentially because they disregard the "informative presence" embedded in missing data patterns. These findings underscore the importance of understanding missingness patterns and of employing imputation strategies that consider the decision-making context behind data availability to enhance model reliability.
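The default strategy named in the abstract, last observation carried forward with a normal-value fallback, can be illustrated with a minimal sketch. This is not the VC-MAES implementation: the function names, the variable names (`heart_rate`, `lactate`), and the reference values in `NORMAL_VALUES` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical "normal" reference values; illustrative only, not taken from the study.
NORMAL_VALUES: Dict[str, float] = {"heart_rate": 75.0, "lactate": 1.0}


def impute_locf_then_normal(
    series: List[Optional[float]], normal_value: float
) -> List[float]:
    """Carry the last observed value forward; before any observation, use the normal value."""
    filled: List[float] = []
    last_seen: Optional[float] = None
    for value in series:
        if value is not None:
            last_seen = value  # a real measurement replaces the carried value
        filled.append(last_seen if last_seen is not None else normal_value)
    return filled


def impute_record(
    record: Dict[str, List[Optional[float]]],
    normals: Dict[str, float] = NORMAL_VALUES,
) -> Dict[str, List[float]]:
    """Apply the imputation independently to each variable's time series."""
    return {
        name: impute_locf_then_normal(values, normals[name])
        for name, values in record.items()
    }


# Toy encounter: heart rate is measured intermittently, lactate is never ordered.
example = {
    "heart_rate": [None, 88.0, None, 110.0],
    "lactate": [None, None, None, None],
}
imputed = impute_record(example)
```

In this sketch a variable that was never measured collapses to its normal value at every time step. That may be one way a model can still read "not ordered" as a signal, whereas replacing the gap with a population mean or a MICE estimate would blur that distinction; the abstract offers this only as a potential explanation.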

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/11b6/11989256/ab2da3e86d22/jcm-14-02213-g001.jpg
