Wang Yuanlong, Yin Changchang, Zhang Ping
Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210, USA.
Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, USA.
medRxiv. 2023 May 26:2023.05.18.23290207. doi: 10.1101/2023.05.18.23290207.
The broad adoption of electronic health records (EHRs) provides great opportunities to conduct healthcare research and solve various clinical problems in medicine. With recent advances and successes, methods based on machine learning and deep learning have become increasingly popular in medical informatics. Combining data from multiple modalities may improve performance on predictive tasks. To assess the potential of multimodal data, we introduce a comprehensive fusion framework designed to integrate the temporal variables, medical images, and clinical notes in EHRs for enhanced performance on downstream predictive tasks. Early, joint, and late fusion strategies were employed to effectively combine data from the various modalities. Model performance and contribution scores show that multimodal models outperform uni-modal models across tasks. Additionally, the temporal variables contain more information than CXR images and clinical notes in the three predictive tasks explored. Therefore, models that integrate different data modalities can perform better on predictive tasks.
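As a rough illustration of the three strategies named above, the sketch below shows one way early, joint, and late fusion of per-modality feature vectors might look in PyTorch. It is an assumption for illustration only, not the paper's architecture: the dimensions, layers, and class names are hypothetical.

```python
# Minimal sketch of early, joint, and late fusion over three modality feature
# vectors (temporal variables, CXR image features, clinical-note features).
# All sizes and layer choices are illustrative assumptions.
import torch
import torch.nn as nn


class EarlyFusion(nn.Module):
    """Concatenate modality features at the input and encode them together."""
    def __init__(self, dims=(64, 256, 128), hidden=128, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(sum(dims), hidden), nn.ReLU(), nn.Linear(hidden, n_classes)
        )

    def forward(self, temporal, image, text):
        return self.net(torch.cat([temporal, image, text], dim=-1))


class JointFusion(nn.Module):
    """Encode each modality separately, then merge the learned representations
    before a shared prediction head (trained end-to-end)."""
    def __init__(self, dims=(64, 256, 128), hidden=64, n_classes=2):
        super().__init__()
        self.encoders = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        self.head = nn.Linear(hidden * len(dims), n_classes)

    def forward(self, temporal, image, text):
        zs = [torch.relu(enc(x)) for enc, x in zip(self.encoders, (temporal, image, text))]
        return self.head(torch.cat(zs, dim=-1))


class LateFusion(nn.Module):
    """Make a prediction per modality and average the predictions."""
    def __init__(self, dims=(64, 256, 128), n_classes=2):
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(d, n_classes) for d in dims])

    def forward(self, temporal, image, text):
        logits = [h(x) for h, x in zip(self.heads, (temporal, image, text))]
        return torch.stack(logits, dim=0).mean(dim=0)
```

The three classes differ only in where the modalities are combined: at the raw-feature level (early), at the learned-representation level (joint), or at the prediction level (late).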