Privacy preserving Generative Adversarial Networks to model Electronic Health Records.

Affiliations

School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, United Kingdom.

Publication information

Neural Netw. 2022 Sep;153:339-348. doi: 10.1016/j.neunet.2022.06.022. Epub 2022 Jun 25.

DOI:10.1016/j.neunet.2022.06.022
PMID:35779443
Abstract

Hospitals and General Practitioner (GP) surgeries within the National Health Service (NHS) routinely collect patient information to create personal health records covering, for example, family medical history, chronic diseases, medications and dosing. The collected information could be used to build and train machine learning models that simplify the work of NHS staff. However, such Electronic Health Records are not made publicly available due to privacy concerns. In this paper, we propose a privacy-preserving Generative Adversarial Network (pGAN), which can generate high-quality synthetic data while preserving the privacy and statistical properties of the source data. pGAN is evaluated on two distinct datasets, one posing a classification task and the other a regression task. The privacy score of the generated data is calculated using the Nearest Neighbour Adversarial Accuracy. Cosine similarity scores of synthetic data from the proposed model indicate that the generated data is similar in nature to the source data, but not identical. Additionally, the proposed model preserves privacy while maintaining high utility. Machine learning models trained on synthetic data and on original data achieved accuracies of 74.3% and 74.5% respectively on the classification dataset, and R² scores of 0.84 and 0.85 respectively on the regression task. Our results therefore indicate that synthetic data from the proposed model could replace original data for machine learning while preserving privacy.
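The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes one common formulation of Nearest Neighbour Adversarial Accuracy (Euclidean nearest-neighbour distances, leave-one-out within each set) and a plain mean of pairwise cosine similarities. The abstract does not state the distance metric, preprocessing, or aggregation the authors used, so the function names and choices below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def _min_dist(a, b, exclude_self=False):
    """Distance from each row of `a` to its nearest row in `b` (Euclidean)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    if exclude_self:
        # When a and b are the same set, ignore each row's zero distance to itself.
        np.fill_diagonal(d, np.inf)
    return d.min(axis=1)


def nn_adversarial_accuracy(real, synth):
    """Assumed formulation: average of two leave-one-out 1-NN accuracies."""
    d_rs = _min_dist(real, synth)                    # real -> nearest synthetic
    d_rr = _min_dist(real, real, exclude_self=True)  # real -> nearest other real
    d_sr = _min_dist(synth, real)                    # synthetic -> nearest real
    d_ss = _min_dist(synth, synth, exclude_self=True)
    return 0.5 * (np.mean(d_rs > d_rr) + np.mean(d_sr > d_ss))


def mean_cosine_similarity(real, synth):
    """Mean cosine similarity over all (real row, synthetic row) pairs."""
    rn = real / np.linalg.norm(real, axis=1, keepdims=True)
    sn = synth / np.linalg.norm(synth, axis=1, keepdims=True)
    return float(np.mean(rn @ sn.T))
```

Under this formulation, a score near 0.5 suggests the synthetic rows are hard to tell apart from real ones without copying them, a score near 0 suggests synthetic rows sit almost on top of real rows (a privacy risk), and a score near 1 suggests the two sets are easily distinguishable.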

Similar articles

1
Privacy preserving Generative Adversarial Networks to model Electronic Health Records.
Neural Netw. 2022 Sep;153:339-348. doi: 10.1016/j.neunet.2022.06.022. Epub 2022 Jun 25.
2
Generating synthetic personal health data using conditional generative adversarial networks combining with differential privacy.
J Biomed Inform. 2023 Jul;143:104404. doi: 10.1016/j.jbi.2023.104404. Epub 2023 Jun 1.
3
Demonstrating the successful application of synthetic learning in spine surgery for training multi-center models with increased patient privacy.
Sci Rep. 2023 Aug 1;13(1):12481. doi: 10.1038/s41598-023-39458-y.
4
LDP-GAN: Generative adversarial networks with local differential privacy for patient medical records synthesis.
Comput Biol Med. 2024 Jan;168:107738. doi: 10.1016/j.compbiomed.2023.107738. Epub 2023 Nov 19.
5
Synthetic Medical Images for Robust, Privacy-Preserving Training of Artificial Intelligence: Application to Retinopathy of Prematurity Diagnosis.
Ophthalmol Sci. 2022 Feb 11;2(2):100126. doi: 10.1016/j.xops.2022.100126. eCollection 2022 Jun.
6
Decentralised, collaborative, and privacy-preserving machine learning for multi-hospital data.
EBioMedicine. 2024 Mar;101:105006. doi: 10.1016/j.ebiom.2024.105006. Epub 2024 Feb 19.
7
Generating sequential electronic health records using dual adversarial autoencoder.
J Am Med Inform Assoc. 2020 Jul 1;27(9):1411-1419. doi: 10.1093/jamia/ocaa119.
8
Generating Synthetic Health Sensor Data for Privacy-Preserving Wearable Stress Detection.
Sensors (Basel). 2024 May 11;24(10):3052. doi: 10.3390/s24103052.
9
Synthesizing time-series wound prognosis factors from electronic medical records using generative adversarial networks.
J Biomed Inform. 2022 Jan;125:103972. doi: 10.1016/j.jbi.2021.103972. Epub 2021 Dec 14.
10
Anonymization Through Data Synthesis Using Generative Adversarial Networks (ADS-GAN).
IEEE J Biomed Health Inform. 2020 Aug;24(8):2378-2388. doi: 10.1109/JBHI.2020.2980262. Epub 2020 Mar 12.

Cited by

1
A novel hybrid convolutional and recurrent neural network model for automatic pituitary adenoma classification using dynamic contrast-enhanced MRI.
Radiol Phys Technol. 2025 Aug 14. doi: 10.1007/s12194-025-00947-6.
2
Generating synthetic multidimensional molecular time series data for machine learning: considerations.
Front Syst Biol. 2023 Jul 25;3:1188009. doi: 10.3389/fsysb.2023.1188009. eCollection 2023.
3
Empirical evaluation of artificial intelligence distillation techniques for ascertaining cancer outcomes from electronic health records.
NPJ Digit Med. 2025 Jun 10;8(1):347. doi: 10.1038/s41746-025-01646-7.
4
Tabular transformer generative adversarial network for heterogeneous distribution in healthcare.
Sci Rep. 2025 Mar 25;15(1):10254. doi: 10.1038/s41598-025-93077-3.
5
Security and Privacy in Machine Learning for Health Systems: Strategies and Challenges.
Yearb Med Inform. 2023 Aug;32(1):269-281. doi: 10.1055/s-0043-1768731. Epub 2023 Dec 26.
6
Deep convolutional and conditional neural networks for large-scale genomic data generation.
PLoS Comput Biol. 2023 Oct 30;19(10):e1011584. doi: 10.1371/journal.pcbi.1011584. eCollection 2023 Oct.
7
Non-fungible tokens for the management of health data.
Nat Med. 2023 Feb;29(2):287-288. doi: 10.1038/s41591-022-02125-2.