School of Biomedical Engineering, Capital Medical University, No.10, Xitoutiao, You An Men, Fengtai District, Beijing 100069, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, No.10, Xitoutiao, You An Men, Fengtai District, Beijing 100069, China.
Information Center, Xuanwu Hospital, Capital Medical University, No.45 Changchun Street, Xicheng District, Beijing 100053, China.
J Biomed Inform. 2023 Jul;143:104427. doi: 10.1016/j.jbi.2023.104427. Epub 2023 Jun 18.
To represent a patient record with both time-invariant and time-varying features as a single vector using an end-to-end deep learning model, and further to predict the kidney failure (KF) status and mortality of heart failure (HF) patients.
The time-invariant electronic medical record (EMR) data comprised demographic information and comorbidities; the time-varying EMR data were laboratory tests. A Transformer encoder module represented the time-invariant data. For the time-varying data, we refined a long short-term memory (LSTM) network and attached a Transformer encoder on top of it, taking as inputs the original measured values, their corresponding embedding vectors, masking vectors, and two types of time intervals. The resulting patient representations were used to predict KF status (949 of 5268 HF patients were diagnosed with KF) and mortality (463 in-hospital deaths) in HF patients. The proposed model was compared with several representative machine learning models. Ablation experiments on the time-varying data representation replaced the refined LSTM with a standard LSTM, GRU-D, or T-LSTM, and removed either the Transformer encoder or the entire time-varying data representation module. Attention weights of the time-invariant and time-varying features were visualized to support clinical interpretation of the predictive performance. Predictive performance was evaluated with the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and the F1-score.
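As an illustration of the time-varying inputs, the following is a minimal sketch of how masking vectors and time intervals could be derived from irregularly sampled lab tests. The abstract does not define its two interval types; this sketch assumes (a) the gap between consecutive time steps and (b) the time since each variable was last observed, the latter in the style of GRU-D. The function name `build_time_varying_inputs` and the toy values are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Row = List[Optional[float]]


def build_time_varying_inputs(
    times: List[float], values: List[Row]
) -> Tuple[List[List[int]], List[float], List[List[float]]]:
    """Derive masks and two assumed kinds of time interval from lab tests.

    times  : ascending time stamps, one per time step (e.g. days).
    values : one row per time step; None marks a test that was not measured.
    """
    n_features = len(values[0])
    masks: List[List[int]] = []
    step_gaps: List[float] = []          # assumed interval type (a)
    since_last: List[List[float]] = []   # assumed interval type (b)
    last_seen: List[Optional[float]] = [None] * n_features

    for i, (t, row) in enumerate(zip(times, values)):
        # Mask is 1 where a value was measured, 0 where it is missing.
        masks.append([0 if v is None else 1 for v in row])
        # Gap to the previous time step; 0 for the first step.
        step_gaps.append(0.0 if i == 0 else t - times[i - 1])
        deltas = []
        for j, v in enumerate(row):
            # Time since feature j was last observed; 0 if never seen before.
            deltas.append(0.0 if last_seen[j] is None else t - last_seen[j])
            if v is not None:
                last_seen[j] = t
        since_last.append(deltas)
    return masks, step_gaps, since_last


# Toy example: two lab tests sampled at days 0, 1 and 3 (made-up values).
times = [0.0, 1.0, 3.0]
values = [[1.2, None], [None, 80.0], [1.5, None]]
masks, step_gaps, since_last = build_time_varying_inputs(times, values)
```

In a model of this kind, the masks and intervals would typically be fed alongside the measured values so the recurrent unit can discount stale observations; how the refined LSTM actually consumes them is not specified in the abstract.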
The proposed model achieved average AUROC, AUPRC, and F1-score values of 0.960, 0.610, and 0.759 for KF prediction and 0.937, 0.353, and 0.537 for mortality prediction, respectively. Predictive performance improved as time-varying data from longer time periods were added. The proposed model outperformed the comparison models and ablation variants in both prediction tasks.
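To put the AUPRC values in context, note that a classifier ranking patients at random has an expected AUPRC roughly equal to the positive-class prevalence. The short sketch below computes that reference level from the counts reported in the abstract. It assumes that the 463 in-hospital deaths are counted within the same 5268-patient cohort, which the abstract does not state explicitly; the variable names are illustrative.

```python
def prevalence(positives: int, total: int) -> float:
    """Fraction of positive cases, i.e. the chance-level AUPRC."""
    return positives / total


N_PATIENTS = 5268  # HF cohort size reported in the abstract

# Chance-level AUPRC for each task (mortality denominator is an assumption).
kf_baseline = prevalence(949, N_PATIENTS)
mortality_baseline = prevalence(463, N_PATIENTS)

# Ratio of the reported AUPRC to the chance level for each task.
kf_lift = 0.610 / kf_baseline
mortality_lift = 0.353 / mortality_baseline

print(f"KF: baseline={kf_baseline:.3f}, lift={kf_lift:.1f}x")
print(f"Mortality: baseline={mortality_baseline:.3f}, lift={mortality_lift:.1f}x")
```

Under that assumption, the lower absolute AUPRC for mortality reflects the rarer outcome rather than weaker discrimination, since both tasks sit several times above their chance levels.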
The proposed unified deep learning model efficiently represented both the time-invariant and time-varying EMR data of patients and showed higher performance in the clinical prediction tasks. The approach to time-varying data used in this study is expected to be applicable to other kinds of time-varying data and other clinical tasks.