State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China.
Neural Netw. 2024 Dec;180:106672. doi: 10.1016/j.neunet.2024.106672. Epub 2024 Aug 29.
Over the past decades, massive volumes of Electronic Health Records (EHRs) have accumulated in Intensive Care Units (ICUs) and many other healthcare settings. This rich and comprehensive information presents an exceptional opportunity for patient outcome prediction. However, because EHRs span diverse data modalities, they are heterogeneous, which makes it difficult to leverage information from the various modalities in an integrated way. There is therefore an urgent need to capture the underlying correlations among modalities. In this paper, we propose a novel framework named Multimodal Fusion Network (MFNet) for ICU patient outcome prediction. First, we incorporate multiple modality-specific encoders to learn a representation for each modality. Notably, a graph-guided encoder is designed to capture the underlying global relationships among medical codes, and a text encoder with a pre-fine-tuning strategy is adopted to extract appropriate text representations. Second, we merge the multimodal representations pairwise using a tailored hierarchical fusion mechanism. Experiments on the eICU-CRD dataset show that MFNet achieves superior performance on mortality prediction and Length of Stay (LoS) prediction compared with various representative and state-of-the-art baselines. Moreover, a comprehensive ablation study demonstrates the effectiveness of each component of MFNet.
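The abstract does not say which merge operator MFNet uses or in what order modalities are paired, so the following is only an illustrative sketch of what "pairwise merging in a hierarchy" can look like, not the authors' implementation. The names `merge_pair` and `hierarchical_fuse`, the fixed `gate` weight, and the left-to-right pairing order are all assumptions made for this example; a real model would most likely learn the merge function.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def merge_pair(a: Sequence[float], b: Sequence[float], gate: float = 0.5) -> Vector:
    """Hypothetical merge operator: a convex combination of two vectors.

    Assumption: both modality representations share the same dimension.
    """
    if len(a) != len(b):
        raise ValueError("representations must have the same length")
    return [gate * x + (1.0 - gate) * y for x, y in zip(a, b)]


def hierarchical_fuse(
    reps: Sequence[Sequence[float]],
    merge: Callable[[Sequence[float], Sequence[float]], Vector] = merge_pair,
) -> Vector:
    """Fuse modality representations by merging adjacent pairs level by level.

    Each pass merges neighbours (0, 1), (2, 3), ... and carries an unpaired
    last vector up unchanged, until a single fused vector remains.
    """
    if not reps:
        raise ValueError("at least one representation is required")
    level = [list(r) for r in reps]
    while len(level) > 1:
        next_level = [
            merge(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # odd one out moves up as-is
        level = next_level
    return level[0]


# Toy stand-ins for three modality embeddings (e.g. codes, text, vitals).
fused = hierarchical_fuse([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

In this toy setup the first two vectors are merged first and the result is then merged with the third, so earlier modalities pass through more merge steps than later ones; any real design would need to choose that ordering deliberately.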