Class Restricted Clustering and Micro-Perturbation for Data Privacy.

Authors

Li Xiao-Bai, Sarkar Sumit

Affiliation

Department of Operations and Information Systems, University of Massachusetts Lowell, Lowell, Massachusetts 01854.

Publication information

Manage Sci. 2013 Apr 1;59(4). doi: 10.1287/mnsc.1120.1584.

DOI: 10.1287/mnsc.1120.1584
PMID: 24307745
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3846357/

Abstract

The extensive use of information technologies by organizations to collect and share personal data has raised strong privacy concerns. To respond to the public's demand for data privacy, a class of clustering-based data masking techniques is increasingly being used for privacy-preserving data sharing and analytics. Traditional clustering-based approaches for masking numeric attributes, while addressing re-identification risks, typically do not consider the disclosure risk of categorical confidential attributes. We propose a new approach to deal with this problem. The proposed method clusters data such that the data points within a group are similar in the non-confidential attribute values whereas the confidential attribute values within a group are . To accomplish this, the clustering method, which is based on a minimum spanning tree (MST) technique, uses two risk-utility tradeoff measures in the growing and pruning stages of the MST technique respectively. As part of our approach we also propose a novel cluster-level micro-perturbation method for masking data that overcomes a common problem of traditional clustering-based methods for data masking, which is their inability to preserve important statistical properties such as the variance of attributes and the covariance across attributes. We show that the mean vector and the covariance matrix of the masked data generated using the micro-perturbation method are unbiased estimates of the original mean vector and covariance matrix. An experimental study on several real-world datasets demonstrates the effectiveness of the proposed approach.
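
The variance problem the abstract attributes to traditional clustering-based masking can be illustrated with a small sketch. The abstract does not specify the authors' micro-perturbation procedure, so the second function below is a hypothetical stand-in, not their method: it adds noise around each cluster centroid and rescales it so every cluster keeps its own mean and variance. The data, the function names, the single numeric attribute, and the simple sort-and-chunk grouping (used here in place of the MST-based clustering) are all assumptions made for illustration.

```python
import random
from statistics import fmean, pstdev, pvariance


def groups_of(values, k):
    """Split record indices into groups of k neighbours after sorting by value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    return [order[s:s + k] for s in range(0, len(order), k)]


def mask_with_centroids(values, k):
    """Traditional cluster-based masking: every value becomes its group mean."""
    masked = [0.0] * len(values)
    for group in groups_of(values, k):
        centroid = fmean([values[i] for i in group])
        for i in group:
            masked[i] = centroid
    return masked


def mask_with_group_noise(values, k, rng):
    """Illustrative stand-in (NOT the paper's method): centroid plus noise,
    rescaled so each group keeps its own mean and population variance."""
    masked = [0.0] * len(values)
    for group in groups_of(values, k):
        original = [values[i] for i in group]
        centroid, spread = fmean(original), pstdev(original)
        noise = [rng.gauss(0.0, 1.0) for _ in group]
        noise_mean, noise_sd = fmean(noise), pstdev(noise)
        for i, z in zip(group, noise):
            if spread == 0.0 or noise_sd == 0.0:
                masked[i] = centroid  # constant or single-record group
            else:
                masked[i] = centroid + spread * (z - noise_mean) / noise_sd
    return masked


# Made-up values for one numeric attribute.
incomes = [21.0, 23.0, 24.0, 30.0, 31.0, 35.0, 48.0, 52.0, 55.0, 70.0, 90.0, 120.0]
centroids = mask_with_centroids(incomes, k=4)
perturbed = mask_with_group_noise(incomes, k=4, rng=random.Random(0))

print("original variance:", pvariance(incomes))
print("centroid masking :", pvariance(centroids))
print("group-noise mask :", pvariance(perturbed))
```

Both masks keep the overall mean, but centroid replacement discards all within-group spread, so its variance is smaller than the original. The stand-in restores that spread exactly in this sample, whereas the abstract claims unbiasedness of the mean vector and covariance matrix as estimates; the sketch also handles only one attribute and says nothing about covariance across attributes.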


Similar articles

1
Class Restricted Clustering and Micro-Perturbation for Data Privacy.
Manage Sci. 2013 Apr 1;59(4). doi: 10.1287/mnsc.1120.1584.
2
Privacy-preserving matching of similar patients.
J Biomed Inform. 2016 Feb;59:285-98. doi: 10.1016/j.jbi.2015.12.004. Epub 2015 Dec 17.
3
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
4
Digression and Value Concatenation to Enable Privacy-Preserving Regression.
MIS Q. 2014 Sep;38(3):679-698. doi: 10.25300/misq/2014/38.3.03.
5
Designing a Novel Approach Using a Greedy and Information-Theoretic Clustering-Based Algorithm for Anonymizing Microdata Sets.
Entropy (Basel). 2023 Dec 1;25(12):1613. doi: 10.3390/e25121613.
6
Developing Privacy Solutions for Sharing and Analyzing Healthcare Data.
Int J Bus Inf Syst. 2013 Jan 1;13(2). doi: 10.1504/IJBIS.2013.054335.
7
Privacy preserving data anonymization of spontaneous ADE reporting system dataset.
BMC Med Inform Decis Mak. 2016 Jul 18;16 Suppl 1(Suppl 1):58. doi: 10.1186/s12911-016-0293-4.
8
Protecting Privacy When Sharing and Releasing Data with Multiple Records per Person.
J Assoc Inf Syst. 2020;21(6):1461-1485. doi: 10.17705/1jais.00643.
9
Anonymizing and Sharing Medical Text Records.
Inf Syst Res. 2017;28(2):332-352. doi: 10.1287/isre.2016.0676. Epub 2017 Apr 12.
10
Differentially private release of medical microdata: an efficient and practical approach for preserving informative attribute values.
BMC Med Inform Decis Mak. 2020 Jul 8;20(1):155. doi: 10.1186/s12911-020-01171-5.

Cited by

1
Protecting Privacy When Sharing and Releasing Data with Multiple Records per Person.
J Assoc Inf Syst. 2020;21(6):1461-1485. doi: 10.17705/1jais.00643.
2
Anonymizing and Sharing Medical Text Records.
Inf Syst Res. 2017;28(2):332-352. doi: 10.1287/isre.2016.0676. Epub 2017 Apr 12.
3
Preserving Patient Privacy When Sharing Same-Disease Data.
ACM J Data Inf Qual. 2016 Oct;7(4). doi: 10.1145/2956554.
4
Unveiling consumer's privacy paradox behaviour in an economic exchange.
Int J Bus Inf Syst. 2016;23(3):307-329. doi: 10.1504/IJBIS.2016.10000351.
5
Pricing and disseminating customer data with privacy awareness.
Decis Support Syst. 2014 Mar 1;59:63-73. doi: 10.1016/j.dss.2013.10.006.
