
Generating unseen diseases patient data using ontology enhanced generative adversarial networks.

Authors

Sun Chang, Dumontier Michel

Affiliations

Institute of Data Science, Faculty of Science and Engineering, Maastricht University, Maastricht, the Netherlands.

Department of Advanced Computing Sciences, Faculty of Science and Engineering, Maastricht University, Maastricht, the Netherlands.

Publication information

NPJ Digit Med. 2025 Jan 3;8(1):4. doi: 10.1038/s41746-024-01421-0.

Abstract

Generating realistic synthetic health data (e.g., electronic health records) holds promise for fundamental research, AI model development, and enhancing data privacy safeguards. Generative Adversarial Networks (GANs) have been employed for this purpose, but their performance is largely constrained by their reliance on training data, rendering them inadequate for rare or previously unseen diseases. This study proposes Onto-CGAN, a novel generative framework that combines knowledge from disease ontologies with GANs to generate patient data for unseen diseases, i.e., diseases that are not present in the training data. The quality of the generated data is evaluated using variable distributions, correlation coefficients, and machine learning model performance. Our findings demonstrate that Onto-CGAN generates data for unseen diseases with statistical characteristics comparable to the real data, and significantly improves the training of machine learning models. This innovative approach addresses the scarcity of data for rare diseases, offering valuable applications in data augmentation, hypothesis generation, and preclinical validation of clinical models.
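
The abstract names variable distributions and correlation coefficients as evaluation criteria but does not say which statistics are computed. As a rough illustration only, the sketch below assumes a two-sample Kolmogorov-Smirnov statistic per variable and the mean absolute difference between Pearson correlation matrices. These choices, the function names, and the toy data are assumptions for illustration and are not taken from the paper; the third criterion, machine learning model performance, is not covered here.

```python
import numpy as np


def ks_statistic(real, synth):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    real, synth = np.sort(real), np.sort(synth)
    grid = np.concatenate([real, synth])
    cdf_real = np.searchsorted(real, grid, side="right") / real.size
    cdf_synth = np.searchsorted(synth, grid, side="right") / synth.size
    return float(np.max(np.abs(cdf_real - cdf_synth)))


def correlation_gap(real, synth):
    """Mean absolute difference between the Pearson correlation matrices."""
    corr_real = np.corrcoef(real, rowvar=False)
    corr_synth = np.corrcoef(synth, rowvar=False)
    return float(np.mean(np.abs(corr_real - corr_synth)))


def compare(real, synth):
    """Summarise how closely a synthetic table (rows = patients) tracks a real one."""
    ks = [ks_statistic(real[:, j], synth[:, j]) for j in range(real.shape[1])]
    return {"ks_per_variable": ks, "correlation_gap": correlation_gap(real, synth)}


if __name__ == "__main__":
    # Toy data only: two correlated variables drawn from the same distribution,
    # standing in for "real" and "synthetic" patient records.
    rng = np.random.default_rng(0)
    cov = [[1.0, 0.6], [0.6, 1.0]]
    real = rng.multivariate_normal([0.0, 0.0], cov, size=2000)
    synth = rng.multivariate_normal([0.0, 0.0], cov, size=2000)
    print(compare(real, synth))
```

Under these assumptions, values near zero for both measures would indicate that the synthetic table reproduces the marginal distributions and pairwise correlations of the real one; larger values flag variables or variable pairs that diverge.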

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93e7/11699131/2ebcd74f5c40/41746_2024_1421_Fig1_HTML.jpg
