Privacy Risk Assessment for Synthetic Longitudinal Health Data.

Affiliations

Knowledge Management, ZB MED - Information Centre for Life Sciences, Cologne, Germany.

Medical Informatics Group, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany.

Publication information

Stud Health Technol Inform. 2024 Aug 30;317:270-279. doi: 10.3233/SHTI240867.

DOI: 10.3233/SHTI240867
PMID: 39234731
Abstract

INTRODUCTION

A modern approach to ensuring privacy when sharing datasets is the use of synthetic data generation methods, which often claim to outperform classic anonymization techniques in the trade-off between data utility and privacy. Recently, it was demonstrated that various deep learning-based approaches are able to generate useful synthesized datasets, often based on domain-specific analyses. However, evaluating the privacy implications of releasing synthetic data remains a challenging problem, especially when the goal is to conform with data protection guidelines.

METHODS

Therefore, the recent privacy risk quantification framework Anonymeter has been built for evaluating multiple possible vulnerabilities, which are specifically based on privacy risks that are considered by the European Data Protection Board, i.e. singling out, linkability, and attribute inference. This framework was applied to a synthetic data generation study from the epidemiological domain, where the synthesization replicates time and age trends previously found in data collected during the DONALD cohort study (1312 participants, 16 time points). The conducted privacy analyses are presented, which place a focus on the vulnerability of outliers.
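To make the notion of a singling-out attack more concrete, the following is a minimal toy sketch in Python. It is not the Anonymeter implementation and does not use its API. The record fields (`age`, `bmi`), the helper names, and the tiny datasets are all hypothetical. The sketch assumes an attacker who turns attribute values that are unique in the synthetic data into guesses, and it compares the success rate on the original records against a held-out control set, in the spirit of the baseline-corrected risk that Anonymeter reports. Confidence intervals and multivariate predicates are omitted.

```python
from collections import Counter


def singles_out(records, attribute, value):
    """Return True if exactly one record has the given attribute value."""
    return sum(1 for r in records if r[attribute] == value) == 1


def univariate_guesses(synthetic, attributes):
    """Build attacker guesses: (attribute, value) pairs unique in the synthetic data."""
    guesses = []
    for attr in attributes:
        counts = Counter(r[attr] for r in synthetic)
        guesses.extend((attr, v) for v, c in counts.items() if c == 1)
    return guesses


def success_rate(guesses, target):
    """Fraction of guesses that single out exactly one record in the target data."""
    if not guesses:
        return 0.0
    hits = sum(singles_out(target, a, v) for a, v in guesses)
    return hits / len(guesses)


def normalized_risk(attack_rate, control_rate):
    """Excess success on the original data relative to a held-out control set.

    Assumed form: (attack - control) / (1 - control), clipped at zero.
    """
    if control_rate >= 1.0:
        return 0.0
    return max(0.0, (attack_rate - control_rate) / (1.0 - control_rate))


# Hypothetical data: the third original record is an outlier (bmi 31.5)
# that the synthetic data reproduces exactly.
original = [{"age": 7, "bmi": 15.2}, {"age": 8, "bmi": 16.0}, {"age": 8, "bmi": 31.5}]
control = [{"age": 9, "bmi": 16.4}, {"age": 10, "bmi": 17.1}]
synthetic = [{"age": 7, "bmi": 15.9}, {"age": 8, "bmi": 31.5}, {"age": 8, "bmi": 16.1}]

guesses = univariate_guesses(synthetic, ["age", "bmi"])
attack = success_rate(guesses, original)
baseline = success_rate(guesses, control)
print("singling-out risk (toy):", normalized_risk(attack, baseline))
```

In this toy setup the outlier value is one of the guesses that succeeds against the original data, which illustrates why the analysis places a focus on the vulnerability of outliers.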

RESULTS

The resulting privacy scores are discussed, which vary greatly between the different types of attacks.

CONCLUSION

Challenges encountered during their implementation and during the interpretation of their results are highlighted, and it is concluded that privacy risk assessment for synthetic data remains an open problem.

Similar articles

1. Privacy Risk Assessment for Synthetic Longitudinal Health Data.
Stud Health Technol Inform. 2024 Aug 30;317:270-279. doi: 10.3233/SHTI240867.
2. The Costs of Anonymization: Case Study Using Clinical Data.
J Med Internet Res. 2024 Apr 24;26:e49445. doi: 10.2196/49445.
3. Protecting Biomedical Data Against Attribute Disclosure.
Stud Health Technol Inform. 2019 Sep 3;267:207-214. doi: 10.3233/SHTI190829.
4. An innovative privacy preserving technique for incremental datasets on cloud computing.
J Biomed Inform. 2016 Aug;62:107-16. doi: 10.1016/j.jbi.2016.06.011. Epub 2016 Jun 28.
5. Privacy-Preserving Generative Deep Neural Networks Support Clinical Data Sharing.
Circ Cardiovasc Qual Outcomes. 2019 Jul;12(7):e005122. doi: 10.1161/CIRCOUTCOMES.118.005122. Epub 2019 Jul 9.
6. Security controls in an integrated Biobank to protect privacy in data sharing: rationale and study design.
BMC Med Inform Decis Mak. 2017 Jul 6;17(1):100. doi: 10.1186/s12911-017-0494-5.
7. UK National Data Guardian for Health and Care's Review of Data Security: Trust, better security and opt-outs.
J Innov Health Inform. 2016 Dec 20;23(3):627-632. doi: 10.14236/jhi.v23i3.909.
8. Navigating ethical quandaries with the privacy dilemma of biomedical datasets.
Pac Symp Biocomput. 2020;25:736-738.
9. Privacy Preservation in Patient Information Exchange Systems Based on Blockchain: System Design Study.
J Med Internet Res. 2022 Mar 22;24(3):e29108. doi: 10.2196/29108.
10. On the Fidelity-Privacy Tradeoff of Synthetic Cancer Registry Data.
Stud Health Technol Inform. 2024 Aug 22;316:621-625. doi: 10.3233/SHTI240490.