Privacy-preserving data sharing via probabilistic modeling.

Author information

Jälkö Joonas, Lagerspetz Eemil, Haukka Jari, Tarkoma Sasu, Honkela Antti, Kaski Samuel

Affiliations

Helsinki Institute for Information Technology (HIIT), Department of Computer Science, Aalto University, Espoo, 00076, Finland.

Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki 00014, Finland.

Publication information

Patterns (N Y). 2021 Jun 7;2(7):100271. doi: 10.1016/j.patter.2021.100271. eCollection 2021 Jul 9.

DOI: 10.1016/j.patter.2021.100271
PMID: 34286296
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8276015/

Abstract

Differential privacy allows quantifying privacy loss resulting from accession of sensitive personal data. Repeated accesses to underlying data incur increasing loss. Releasing data as privacy-preserving synthetic data would avoid this limitation but would leave open the problem of designing what kind of synthetic data. We propose formulating the problem of private data release through probabilistic modeling. This approach transforms the problem of designing the synthetic data into choosing a model for the data, allowing also the inclusion of prior knowledge, which improves the quality of the synthetic data. We demonstrate empirically, in an epidemiological study, that statistical discoveries can be reliably reproduced from the synthetic data. We expect the method to have broad use in creating high-quality anonymized data twins of key datasets for research.
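
The idea in the abstract, that releasing a privately fitted model avoids the growing cost of repeated data access, can be illustrated with a toy sketch. This is not the authors' method: it swaps in a one-parameter Bernoulli model fitted with the standard Laplace mechanism, treats the number of records as public, and all function and variable names (`dp_bernoulli_rate`, `sample_synthetic`, `composed_epsilon`) are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_bernoulli_rate(records, epsilon, rng):
    """Fit P(x = 1) with the Laplace mechanism.

    Changing one record moves the count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    Assumption: len(records) is public knowledge.
    """
    noisy_count = sum(records) + laplace_noise(1.0 / epsilon, rng)
    rate = noisy_count / len(records)
    # Clamping is post-processing and costs no extra privacy.
    return min(1.0, max(0.0, rate))


def sample_synthetic(rate, size, rng):
    # Sampling from the private model is also post-processing.
    return [1 if rng.random() < rate else 0 for _ in range(size)]


def composed_epsilon(per_query_epsilon, num_queries):
    # Basic sequential composition: direct queries on the
    # sensitive data add up their privacy losses.
    return per_query_epsilon * num_queries


rng = random.Random(0)
# Stand-in for a sensitive binary attribute (simulated, not real data).
sensitive = [1 if rng.random() < 0.3 else 0 for _ in range(10_000)]

epsilon = 1.0
private_rate = dp_bernoulli_rate(sensitive, epsilon, rng)
synthetic = sample_synthetic(private_rate, 10_000, rng)
synthetic_mean = sum(synthetic) / len(synthetic)
```

In this sketch the sensitive data is touched once, at a cost of `epsilon`; any number of analyses on `synthetic` adds nothing further, whereas `composed_epsilon(epsilon, k)` shows how `k` direct queries would accumulate loss under basic composition. A richer model would play the role that model choice and prior knowledge play in the abstract.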
