A deep attention model to forecast the Length Of Stay and the in-hospital mortality right on admission from ICD codes and demographic data.

Affiliations

Department of Computer Science, Sangmyung University, Seoul, Republic of Korea.

Graduate School of Information, Yonsei University, Seoul, Republic of Korea.

Publication information

J Biomed Inform. 2021 Jun;118:103778. doi: 10.1016/j.jbi.2021.103778. Epub 2021 Apr 17.

Abstract

Leveraging longitudinal Electronic Health Record (EHR) data to produce actionable clinical insights has been a central concern of recent studies. Unforeseen extended hospitalizations account for a disproportionate share of resource use, mediocre quality of inpatient care, and avoidable fatalities. The ability to predict the Length of Stay (LoS) and mortality early in an admission provides opportunities to improve care and prevent avoidable losses. Forecasting in-hospital mortality gives clinicians the insight they need to make decisions and helps hospitals allocate resources; predicting LoS and mortality within the first day of admission is therefore a difficult but paramount endeavor. The biggest challenge is that few data are available at this point, so the prediction has to draw on the history of previous admissions and the free-text diagnosis recorded immediately on admission. We propose a model that uses multi-modal structured EHR medical codes and key demographic information to classify LoS into three classes: Short LoS (LoS ⩽ 10 days), Medium LoS (10 < LoS ⩽ 30 days), and Long LoS (LoS > 30 days), and to predict mortality as a binary classification of whether the patient dies during the current admission. The prediction may use only data available within 24 h of admission. The key predictors are previous ICD9 diagnosis codes, ICD9 procedure codes, key demographic data, and the free-text diagnosis of the current admission recorded right on admission. We propose a Hierarchical Attention Network model (HAN-LoS and HAN-Mor) and train it on a dataset of over 45,321 admissions recorded in the de-identified MIMIC-III dataset. To improve prediction, the attention mechanisms focus on the most influential past admissions and the most influential codes within those admissions. For a fair performance evaluation, we implemented previous approaches and compared the HAN model against them. With dataset balancing techniques, HAN-LoS achieved an AUROC of over 0.82 and a Micro-F1 score of 0.24, and HAN-Mor achieved an AUROC of 0.87, outperforming existing baselines that use structured medical codes as well as clinical time series for LoS and mortality forecasting. By predicting mortality and LoS with the same model, we show that with little tuning the proposed model can be used for other clinical predictive tasks such as phenotyping, decompensation, re-admission prediction, and survival analysis.
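The three LoS classes and the two levels of attention described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the architecture, so the dot-product scoring, the context vectors, the embedding size, and every function name below are assumptions made for illustration only. The sketch bins a stay into the three classes by the stated thresholds, pools the code embeddings of each past admission with attention, and then pools the resulting admission vectors with a second attention step.

```python
import numpy as np


def los_class(los_days: float) -> str:
    """Map a length of stay in days to the three classes given in the abstract."""
    if los_days <= 10:
        return "short"
    if los_days <= 30:
        return "medium"
    return "long"


def softmax(scores: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - scores.max()
    weights = np.exp(shifted)
    return weights / weights.sum()


def attention_pool(vectors: np.ndarray, context: np.ndarray):
    """Attention-weighted average of the rows of `vectors`.

    Scores are dot products with a context vector. This scoring
    function is an assumption, not taken from the paper.
    """
    weights = softmax(vectors @ context)
    return weights @ vectors, weights


def encode_history(admissions, code_context, admission_context):
    """Two-level (hierarchical) attention over a patient's past admissions.

    `admissions` is a list of arrays, one per past admission, each of
    shape (number_of_codes, embedding_dim).
    """
    # Level 1: weight the codes inside each admission.
    admission_vectors = np.stack(
        [attention_pool(codes, code_context)[0] for codes in admissions]
    )
    # Level 2: weight the admissions against each other.
    return attention_pool(admission_vectors, admission_context)


# Hypothetical data: three past admissions with 3, 5, and 2 coded items.
rng = np.random.default_rng(0)
dim = 8
history = [rng.normal(size=(n, dim)) for n in (3, 5, 2)]
patient_vector, admission_weights = encode_history(
    history, rng.normal(size=dim), rng.normal(size=dim)
)
```

In a trained model the context vectors and code embeddings would be learned parameters, and `patient_vector` would be combined with demographic features before the classification layer; here they are random, so the sketch only shows how the admission-level weights form a distribution over past admissions.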

