Combining structured and unstructured data for predictive models: a deep learning approach.

Affiliations

Department of Biomedical Informatics, The Ohio State University, 1800 Cannon Drive, Columbus, OH, 43210, USA.

School of Computer Science and Technology, Wuhan University of Technology, Wuhan, 430070, Hubei, China.

Publication information

BMC Med Inform Decis Mak. 2020 Oct 29;20(1):280. doi: 10.1186/s12911-020-01297-6.

Abstract

BACKGROUND

The broad adoption of electronic health records (EHRs) provides great opportunities to conduct health care research and solve various clinical problems in medicine. Following their recent advances and successes, methods based on machine learning and deep learning have become increasingly popular in medical informatics. However, while many studies use temporal structured data for predictive modeling, they typically neglect potentially valuable information in unstructured clinical notes. Integrating heterogeneous data types across EHRs through deep learning techniques may help improve the performance of prediction models.

METHODS

In this research, we propose two general-purpose multi-modal neural network architectures that enhance patient representation learning by combining sequential unstructured notes with structured data. The proposed fusion models use document embeddings to represent long clinical note documents, either convolutional neural networks or long short-term memory networks to model the sequential clinical notes and temporal signals, and one-hot encoding to represent static information. The three representations are concatenated to form the final patient representation, which is used to make predictions.
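The fusion step described above can be illustrated with a minimal sketch of the convolutional variant. This is not the authors' implementation: all function names, dimensions, and the admission-type vocabulary are hypothetical, the note embeddings and temporal signals are random stand-ins for real document embeddings and measurements, and the weights are random and untrained. It only shows how the three representations might be built and concatenated before a sigmoid output.

```python
import math
import random


def one_hot(value, vocabulary):
    """Encode a categorical static feature as a one-hot vector."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def conv1d_max_pool(sequence, filters):
    """Slide each filter over a sequence of vectors, apply ReLU,
    then keep the maximum activation over time.

    sequence: list of T vectors of length d (T >= filter width)
    filters:  list of filters, each a list of k weight vectors of length d
    returns:  one pooled activation per filter
    """
    pooled = []
    for filt in filters:
        k = len(filt)
        activations = []
        for t in range(len(sequence) - k + 1):
            window = sequence[t:t + k]
            s = sum(w * x
                    for w_vec, x_vec in zip(filt, window)
                    for w, x in zip(w_vec, x_vec))
            activations.append(max(0.0, s))  # ReLU
        pooled.append(max(activations))      # max-over-time pooling
    return pooled


def random_vectors(n, dim, rng):
    """Random stand-ins for embeddings, signals, or filter weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n)]


def fuse_and_predict(static_vec, notes, signals,
                     note_filters, signal_filters, out_weights, out_bias):
    """Concatenate static, temporal, and note representations, then
    map the fused patient representation to a risk score in (0, 1)."""
    note_repr = conv1d_max_pool(notes, note_filters)
    signal_repr = conv1d_max_pool(signals, signal_filters)
    patient_repr = static_vec + signal_repr + note_repr  # concatenation
    logit = sum(w * x for w, x in zip(out_weights, patient_repr)) + out_bias
    return patient_repr, 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)

# Hypothetical static feature: admission type with three categories.
static_vec = one_hot("emergency", ["elective", "emergency", "urgent"])

# Assumed shapes: 5 notes with 4-dim embeddings, 6 time steps of 3 signals.
notes = random_vectors(5, 4, rng)
signals = random_vectors(6, 3, rng)

# Two untrained filters per modality (width 2 for notes, width 3 for signals).
note_filters = [random_vectors(2, 4, rng) for _ in range(2)]
signal_filters = [random_vectors(3, 3, rng) for _ in range(2)]

# Output layer sized to the fused vector: 3 static + 2 signal + 2 note.
out_weights = [rng.uniform(-0.5, 0.5) for _ in range(7)]

patient_repr, risk = fuse_and_predict(
    static_vec, notes, signals, note_filters, signal_filters, out_weights, 0.0)
print(len(patient_repr), risk)
```

In the long short-term memory variant, a recurrent encoder would take the place of `conv1d_max_pool` for the note and signal sequences, while the one-hot encoding and the concatenation step would stay the same.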

RESULTS

We evaluate the performance of the proposed models on three risk prediction tasks (in-hospital mortality, 30-day hospital readmission, and long length of stay) using data derived from the publicly available Medical Information Mart for Intensive Care III (MIMIC-III) dataset. Our results show that, by combining unstructured clinical notes with structured data, the proposed models outperform models that use either unstructured notes or structured data alone.

CONCLUSIONS

The proposed fusion models learn better patient representations by combining structured and unstructured data. Integrating heterogeneous data types across EHRs helps improve the performance of prediction models and reduce errors.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fa/7596962/e371881c47fa/12911_2020_1297_Fig1_HTML.jpg
