Generating sequential electronic health records using dual adversarial autoencoder.

Affiliations

Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang, South Korea.

School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA.

Publication information

J Am Med Inform Assoc. 2020 Jul 1;27(9):1411-1419. doi: 10.1093/jamia/ocaa119.

Abstract

OBJECTIVE

Recent studies on electronic health records (EHRs) have begun to train deep generative models that synthesize large volumes of realistic records, in order to address the significant privacy issues surrounding EHRs. However, most of them focus only on structured records of patients' independent visits rather than on chronological clinical records. In this article, we aim to learn and synthesize realistic sequences of EHRs using a generative autoencoder.

MATERIALS AND METHODS

We propose a dual adversarial autoencoder (DAAE), which learns set-valued sequences of medical entities by combining a recurrent autoencoder with 2 generative adversarial networks (GANs). DAAE improves the mode coverage and quality of generated sequences by adversarially learning both the continuous latent distribution and the discrete data distribution. Using the MIMIC-III (Medical Information Mart for Intensive Care-III) and UT Physicians clinical databases, we evaluated the performance of DAAE in terms of predictive modeling, plausibility, and privacy preservation.

RESULTS

Our generated EHR sequences showed performance comparable to real data on a predictive modeling task and achieved the best score among all baseline models in a plausibility evaluation conducted by medical experts. In addition, differentially private optimization of our model makes it possible to generate synthetic sequences without increasing the privacy leakage of patients' data.

CONCLUSIONS

DAAE can effectively synthesize sequential EHRs by addressing the main challenges of this task: the synthetic records should be realistic enough to be indistinguishable from real records, and they should cover all the training patients so that the performance of specific downstream tasks can be reproduced.
