
Digression and Value Concatenation to Enable Privacy-Preserving Regression.

Authors

Li Xiao-Bai, Sarkar Sumit

Affiliations

Department of Operations and Information Systems, Manning School of Business, University of Massachusetts Lowell, Lowell, MA 01854 U.S.A.

Naveen Jindal School of Management, University of Texas at Dallas, Richardson, TX 75080 U.S.A.

Publication information

MIS Q. 2014 Sep;38(3):679-698. doi: 10.25300/misq/2014/38.3.03.

DOI:10.25300/misq/2014/38.3.03
PMID:26752802
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4703130/
Abstract

Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals' sensitive data. This problem, which we call a "regression attack," has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression, which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis.
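
The intuition behind a regression attack can be illustrated with a toy sketch. The code below is not the authors' algorithm and does not implement their measure. It assumes a one-level regression tree on a single known attribute, and it uses the standard deviation of the sensitive values in each leaf as a simple stand-in for disclosure risk. The records, the names `best_split` and `leaf_spreads`, and the `MIN_SPREAD` threshold are all hypothetical.

```python
from statistics import mean, pstdev

# Toy records: (age, salary). Salary is the sensitive attribute; age is
# assumed to be known to the attacker. All values are made up.
records = [(25, 40), (27, 42), (29, 41), (45, 80), (48, 120), (52, 60)]


def best_split(rows):
    """Return the age threshold that minimises the summed squared error
    of salary over the two resulting leaves (a one-level regression tree)."""
    def sse(vals):
        m = mean(vals)
        return sum((v - m) ** 2 for v in vals)

    ages = sorted({a for a, _ in rows})
    best = None
    for lo, hi in zip(ages, ages[1:]):
        t = (lo + hi) / 2  # candidate threshold between adjacent ages
        left = [s for a, s in rows if a <= t]
        right = [s for a, s in rows if a > t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t)
    return best[1]


def leaf_spreads(rows, threshold):
    """Population standard deviation of salary in each leaf."""
    left = [s for a, s in rows if a <= threshold]
    right = [s for a, s in rows if a > threshold]
    return pstdev(left), pstdev(right)


threshold = best_split(records)
left_sd, right_sd = leaf_spreads(records, threshold)

# Hypothetical cut-off: a leaf whose salaries vary less than this is treated
# as revealing, because the leaf mean predicts each member's salary closely.
MIN_SPREAD = 5.0
risky = [name for name, sd in (("left", left_sd), ("right", right_sd))
         if sd < MIN_SPREAD]
print(threshold, risky)
```

In this sketch, a leaf flagged as risky is one where knowing a person's age is enough to estimate their salary closely. A pruning step of the kind the abstract describes would avoid releasing such a split, though the paper's actual criterion may differ from the spread test assumed here.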


Similar articles

1
Digression and Value Concatenation to Enable Privacy-Preserving Regression.
MIS Q. 2014 Sep;38(3):679-698. doi: 10.25300/misq/2014/38.3.03.
2
Privacy preserving data anonymization of spontaneous ADE reporting system dataset.
BMC Med Inform Decis Mak. 2016 Jul 18;16 Suppl 1(Suppl 1):58. doi: 10.1186/s12911-016-0293-4.
3
Anonymizing 1:M microdata with high utility.
Knowl Based Syst. 2017 Jan 1;115:15-26. doi: 10.1016/j.knosys.2016.10.012. Epub 2016 Oct 21.
4
The cost of quality: Implementing generalization and suppression for anonymizing biomedical data with minimal information loss.
J Biomed Inform. 2015 Dec;58:37-48. doi: 10.1016/j.jbi.2015.09.007. Epub 2015 Sep 15.
5
Privacy-Preserving Anonymity for Periodical Releases of Spontaneous Adverse Drug Event Reporting Data: Algorithm Development and Validation.
JMIR Med Inform. 2021 Oct 28;9(10):e28752. doi: 10.2196/28752.
6
Utility-preserving anonymization for health data publishing.
BMC Med Inform Decis Mak. 2017 Jul 11;17(1):104. doi: 10.1186/s12911-017-0499-0.
7
An Efficient Big Data Anonymization Algorithm Based on Chaos and Perturbation Techniques.
Entropy (Basel). 2018 May 17;20(5):373. doi: 10.3390/e20050373.
8
Protecting Privacy When Sharing and Releasing Data with Multiple Records per Person.
J Assoc Inf Syst. 2020;21(6):1461-1485. doi: 10.17705/1jais.00643.
9
Anonymizing and Sharing Medical Text Records.
Inf Syst Res. 2017;28(2):332-352. doi: 10.1287/isre.2016.0676. Epub 2017 Apr 12.
10
Differentially private release of medical microdata: an efficient and practical approach for preserving informative attribute values.
BMC Med Inform Decis Mak. 2020 Jul 8;20(1):155. doi: 10.1186/s12911-020-01171-5.

Cited by

1
Protecting Privacy When Sharing and Releasing Data with Multiple Records per Person.
J Assoc Inf Syst. 2020;21(6):1461-1485. doi: 10.17705/1jais.00643.
2
Leveraging interdependencies among platform and complementors in innovation ecosystem.
PLoS One. 2020 Oct 5;15(10):e0239972. doi: 10.1371/journal.pone.0239972. eCollection 2020.
3
Preserving Patient Privacy When Sharing Same-Disease Data.
ACM J Data Inf Qual. 2016 Oct;7(4). doi: 10.1145/2956554.
4
Unveiling consumer's privacy paradox behaviour in an economic exchange.
Int J Bus Inf Syst. 2016;23(3):307-329. doi: 10.1504/IJBIS.2016.10000351.