
Synthetic data and local differential privacy for private frequency estimation.

Authors

Varma Gatha, Chauhan Ritu, Singh Dhananjay

Affiliations

Amity Institute of Information Technology, Amity University, Noida, India.

Center for Computational Biology and Bioinformatics, Amity University, Noida, India.

Publication information

Cybersecur (Singap). 2022;5(1):26. doi: 10.1186/s42400-022-00129-6. Epub 2022 Aug 3.

DOI:10.1186/s42400-022-00129-6
PMID:35936976
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9345740/
Abstract

The collection of user attributes by service providers is a double-edged sword. These attributes are instrumental in driving statistical analysis and in training more accurate predictive models such as recommenders, and analyzing them includes frequency estimation for categorical attributes. Nonetheless, users deserve privacy guarantees against inadvertent identity disclosure. Algorithms called frequency oracles were therefore developed to randomize or perturb user attributes and to estimate the frequencies of their values. We propose a frequency oracle that uses Randomized Aggregatable Privacy-Preserving Ordinal Response (RAPPOR) and Hadamard Response (HR) for randomization, in combination with fake data. The design of a service-oriented architecture must consider two types of complexity, namely computational and communication. Such systems aim to minimize both, so the choice of privacy-enhancing methods must be a calculated decision. The variant of RAPPOR we use is realized through Bloom filters. A Bloom filter is a memory-efficient data structure that offers a time complexity of O(1). HR, on the other hand, has been proven to give the best communication cost, of the order of log(b) for b-bit communication. The proposed oracle is therefore a step towards frequency oracles that show how the privacy provisions of existing methods can be combined with those of fake data to achieve statistical results comparable to those obtained on the original data. It also implements an adaptive solution that extends the work of Arcolezi et al. The use of RAPPOR was found to provide better privacy-utility tradeoffs for specific privacy budgets in both the high and general privacy regimes.
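
The randomize-then-estimate idea behind a frequency oracle can be illustrated with a minimal sketch. It is not the authors' method: it assumes a plain one-hot encoding with symmetric bit flipping (the basic, Bloom-filter-free form of RAPPOR-style randomization) and omits Hadamard Response, fake data and the adaptive step. The function names `perturb`, `estimate_frequencies` and `run_demo`, the three-value domain and the budget `epsilon=2.0` are all hypothetical choices made for illustration.

```python
import math
import random


def keep_probability(epsilon):
    # Probability of reporting a bit unchanged. Two inputs differ in at most
    # two one-hot bits, so (p / (1 - p)) ** 2 = e^epsilon bounds the ratio.
    return math.exp(epsilon / 2) / (math.exp(epsilon / 2) + 1)


def perturb(value, domain_size, epsilon, rng):
    """Client side: one-hot encode `value`, then flip each bit independently."""
    p = keep_probability(epsilon)
    bits = [1 if i == value else 0 for i in range(domain_size)]
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_frequencies(reports, epsilon):
    """Server side: unbiased frequency estimates from the noisy reports."""
    n = len(reports)
    p = keep_probability(epsilon)
    q = 1 - p  # probability that a 0 bit is reported as 1
    estimates = []
    for i in range(len(reports[0])):
        observed = sum(r[i] for r in reports) / n
        # E[observed] = q + f_i * (p - q), so invert that linear relation.
        estimates.append((observed - q) / (p - q))
    return estimates


def run_demo(n=20000, epsilon=2.0, seed=0):
    # Hypothetical categorical attribute with three values (assumption).
    rng = random.Random(seed)
    values = rng.choices(range(3), weights=[0.5, 0.3, 0.2], k=n)
    empirical = [values.count(v) / n for v in range(3)]
    reports = [perturb(v, 3, epsilon, rng) for v in values]
    return empirical, estimate_frequencies(reports, epsilon)


if __name__ == "__main__":
    empirical, estimated = run_demo()
    for v, (f, e) in enumerate(zip(empirical, estimated)):
        print(f"value {v}: true {f:.3f}, estimated {e:.3f}")
```

Each user sends only the perturbed bit vector, and the aggregator recovers frequencies by inverting the known flipping probabilities. A smaller `epsilon` pushes the keep probability toward 1/2, which strengthens privacy but increases the variance of the estimates.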

