Differentially Private Histogram Publication For Dynamic Datasets: An Adaptive Sampling Approach.

Authors

Li Haoran, Jiang Xiaoqian, Xiong Li, Liu Jinfei

Affiliations

Emory University, Atlanta, GA.

University of California, San Diego, La Jolla, CA.

Publication information

Proc ACM Int Conf Inf Knowl Manag. 2015 Oct;2015:1001-1010. doi: 10.1145/2806416.2806441.

Abstract

Differential privacy has recently become a de facto standard for private statistical data release. Many algorithms have been proposed to generate differentially private histograms or synthetic data. However, most of them focus on "one-time" release of a static dataset and do not adequately address the growing need to release series of dynamic datasets in real time. A straightforward application of existing histogram methods to each snapshot of such dynamic datasets incurs high accumulated error, due to the composability of differential privacy and to correlations or overlapping users between the snapshots. In this paper, we address the problem of releasing series of dynamic datasets in real time with differential privacy, using a novel adaptive distance-based sampling approach. Our first method, DSFT, uses a fixed distance threshold and releases a differentially private histogram only when the current snapshot is sufficiently different from the previous one, i.e., when their distance is greater than a predefined threshold. Our second method, DSAT, further improves DSFT by using a dynamic threshold that is adaptively adjusted by a feedback control mechanism to capture the data dynamics. Extensive experiments on real and synthetic datasets demonstrate that our approach achieves better utility than baseline methods and existing state-of-the-art methods.
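
The distance-based sampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance measure, the budget allocation, or the controller, so the mean absolute distance, the Laplace noise scales, the `max_releases` cap, the proportional update in `adapt_threshold`, and all function and parameter names here are assumptions chosen for illustration. The privacy accounting is deliberately simplified and should not be read as a proof of any guarantee.

```python
import numpy as np


def laplace_histogram(hist, epsilon, rng):
    """Perturb each bin with Laplace noise of scale 1/epsilon (assumes sensitivity 1)."""
    return hist + rng.laplace(scale=1.0 / epsilon, size=hist.shape)


def dsft_sketch(snapshots, threshold, eps_distance, eps_release, max_releases, rng):
    """Fixed-threshold sampling: publish a fresh noisy histogram only when the
    current snapshot looks far enough from the last published one; otherwise
    republish the previous output. All parameters are hypothetical."""
    published, flags = [], []
    last_release = None
    releases = 0
    for snapshot in snapshots:
        hist = np.asarray(snapshot, dtype=float)
        release = False
        if last_release is None:
            # The first snapshot is always released so there is something to compare to.
            release = True
        elif releases < max_releases:
            # Assumed distance: mean absolute difference to the last published
            # histogram, itself perturbed before it is compared to the threshold.
            true_dist = np.abs(hist - last_release).mean()
            noisy_dist = true_dist + rng.laplace(scale=1.0 / (eps_distance * hist.size))
            release = noisy_dist > threshold
        if release:
            last_release = laplace_histogram(hist, eps_release, rng)
            releases += 1
        published.append(last_release)
        flags.append(bool(release))
    return published, flags


def adapt_threshold(threshold, observed_rate, target_rate, gain):
    """Hypothetical proportional feedback step for an adaptive threshold:
    raise the threshold when releases happen more often than the target rate,
    lower it when they happen less often."""
    return threshold * (1.0 + gain * (observed_rate - target_rate))
```

In an adaptive variant along the lines of DSAT, a caller might recompute the threshold with something like `adapt_threshold` after each time step, using the fraction of recent snapshots that triggered a release as `observed_rate`. The feedback controller used in the paper may differ from this proportional rule.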

Similar articles

1
Differentially Private Histogram Publication For Dynamic Datasets: An Adaptive Sampling Approach.
Proc ACM Int Conf Inf Knowl Manag. 2015 Oct;2015:1001-1010. doi: 10.1145/2806416.2806441.
2
Differentially Private Synthesization of Multi-Dimensional Data using Copula Functions.
Adv Database Technol. 2014;2014:475-486. doi: 10.5441/002/edbt.2014.43.
3
DPSynthesizer: Differentially Private Data Synthesizer for Privacy Preserving Data Sharing.
Proceedings VLDB Endowment. 2014 Aug;7(13):1677-1680. doi: 10.14778/2733004.2733059.
4
Robust Fingerprint of Location Trajectories Under Differential Privacy.
Proc Priv Enhanc Technol. 2023 Jul;2023(4):5-20. doi: 10.56553/popets-2023-0095.
7
Does Differentially Private Synthetic Data Lead to Synthetic Discoveries?
Methods Inf Med. 2024 May;63(1-02):35-51. doi: 10.1055/a-2385-1355. Epub 2024 Aug 13.
8
SHARE: system design and case studies for statistical health information release.
J Am Med Inform Assoc. 2013 Jan 1;20(1):109-16. doi: 10.1136/amiajnl-2012-001032. Epub 2012 Oct 11.
9
Preserving differential privacy in deep neural networks with relevance-based adaptive noise imposition.
Neural Netw. 2020 May;125:131-141. doi: 10.1016/j.neunet.2020.02.001. Epub 2020 Feb 11.
10
Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning.
Proc Int Conf Data Eng. 2015 Apr;2015:1035-1046. doi: 10.1109/ICDE.2015.7113354.

Cited by

2
Differential privacy EV charging data release based on variable window.
PeerJ Comput Sci. 2021 Apr 22;7:e481. doi: 10.7717/peerj-cs.481. eCollection 2021.
3
The anatomy of a distributed predictive modeling framework: online learning, blockchain network, and consensus algorithm.
JAMIA Open. 2020 Jul 6;3(2):201-208. doi: 10.1093/jamiaopen/ooaa017. eCollection 2020 Jul.
5
Quantifying Differential Privacy in Continuous Data Release Under Temporal Correlations.
IEEE Trans Knowl Data Eng. 2019 Jul;31(7):1281-1295. doi: 10.1109/TKDE.2018.2824328. Epub 2018 Apr 9.
6
Privacy Policy and Technology in Biomedical Data Science.
Annu Rev Biomed Data Sci. 2018 Jul;1:115-129. doi: 10.1146/annurev-biodatasci-080917-013416.
7
Partitioning-based mechanisms under personalized differential privacy.
Adv Knowl Discov Data Min. 2017 May;10234:615-627. doi: 10.1007/978-3-319-57454-7_48. Epub 2017 Apr 23.
8
Quantifying Differential Privacy under Temporal Correlations.
Proc Int Conf Data Eng. 2017 Apr;2017:821-832. doi: 10.1109/ICDE.2017.132. Epub 2017 May 18.
