On Anonymizing Medical Microdata with Large-Scale Missing Values - A Case Study with the FAERS Dataset.

Authors

Hsiao Mei-Hui, Lin Wen-Yang, Hsu Kuang-Yung, Shen Zih-Xun

Publication information

Annu Int Conf IEEE Eng Med Biol Soc. 2019 Jul;2019:6505-6508. doi: 10.1109/EMBC.2019.8857025.

DOI:10.1109/EMBC.2019.8857025
PMID:31947331
Abstract

As big data analysis becomes one of the main driving forces for productivity and economic growth, concern about the disclosure of individual privacy grows as well, especially for applications that access medical or health data containing personal information. Most contemporary techniques for privacy-preserving data publishing rest on a simple assumption: the data of concern are complete, i.e., contain no missing values. This, however, is not the case in the real world. This paper presents our efforts to examine the effect of missing values on medical data privacy. In particular, we examined the US FAERS dataset, a public dataset of adverse drug events released by the US FDA. Following the presumption of the current anonymization paradigm that data should contain no missing values, we investigated three intuitive strategies for anonymizing the FAERS dataset: including missing values, excluding them, or imputing them. Our results demonstrate that these intuitive strategies handle data with a massive amount of missing values poorly. Accordingly, we propose a new strategy, consolidation, together with a corresponding privacy protection model and anonymization algorithm. Experimental results show that our method can prevent privacy disclosure and sustain data utility for adverse drug reaction (ADR) signal detection.
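The three intuitive strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The toy records, the choice of quasi-identifier columns, the mode-based imputation, and the k-anonymity-style group-size check are all assumptions made here for illustration, since the abstract does not specify the privacy model or the imputation method. The proposed consolidation strategy is not described in the abstract, so it is not sketched.

```python
from collections import Counter

# Hypothetical toy records: (age_group, sex, country) stand in for
# quasi-identifiers. None marks a missing value. Invented for illustration.
RECORDS = [
    ("60-69", "F", "US"),
    ("60-69", "F", "US"),
    ("60-69", None, "US"),
    ("30-39", "M", None),
    ("30-39", "M", "JP"),
    ("30-39", "F", "JP"),
]


def include_missing(rows):
    """Strategy 1: keep missing values, treated as a category of their own."""
    return [tuple("MISSING" if v is None else v for v in row) for row in rows]


def exclude_missing(rows):
    """Strategy 2: drop every row that has at least one missing value."""
    return [row for row in rows if None not in row]


def impute_mode(rows):
    """Strategy 3 (assumed variant): fill gaps with each column's most frequent value."""
    columns = list(zip(*rows))
    modes = [
        Counter(v for v in col if v is not None).most_common(1)[0][0]
        for col in columns
    ]
    return [tuple(m if v is None else v for v, m in zip(row, modes)) for row in rows]


def smallest_group(rows):
    """Size of the smallest group of identical rows (the k of k-anonymity)."""
    return min(Counter(rows).values()) if rows else 0


if __name__ == "__main__":
    for name, strategy in [
        ("include", include_missing),
        ("exclude", exclude_missing),
        ("impute", impute_mode),
    ]:
        result = strategy(RECORDS)
        print(name, "rows kept:", len(result), "smallest group:", smallest_group(result))
```

On this toy table, exclusion discards a third of the rows and imputation writes in values that were never reported, while every strategy still leaves groups of size one before any generalization is applied. That is only a small-scale illustration of the trade-offs, not a reproduction of the paper's experiments.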

Similar articles

1
On Anonymizing Medical Microdata with Large-Scale Missing Values - A Case Study with the FAERS Dataset.
Annu Int Conf IEEE Eng Med Biol Soc. 2019 Jul;2019:6505-6508. doi: 10.1109/EMBC.2019.8857025.
2
Privacy preserving data anonymization of spontaneous ADE reporting system dataset.
BMC Med Inform Decis Mak. 2016 Jul 18;16 Suppl 1(Suppl 1):58. doi: 10.1186/s12911-016-0293-4.
3
Privacy-Preserving Anonymity for Periodical Releases of Spontaneous Adverse Drug Event Reporting Data: Algorithm Development and Validation.
JMIR Med Inform. 2021 Oct 28;9(10):e28752. doi: 10.2196/28752.
4
Differentially private release of medical microdata: an efficient and practical approach for preserving informative attribute values.
BMC Med Inform Decis Mak. 2020 Jul 8;20(1):155. doi: 10.1186/s12911-020-01171-5.
5
Utility-preserving anonymization for health data publishing.
BMC Med Inform Decis Mak. 2017 Jul 11;17(1):104. doi: 10.1186/s12911-017-0499-0.
6
Anonymizing 1:M microdata with high utility.
Knowl Based Syst. 2017 Jan 1;115:15-26. doi: 10.1016/j.knosys.2016.10.012. Epub 2016 Oct 21.
7
The cost of quality: Implementing generalization and suppression for anonymizing biomedical data with minimal information loss.
J Biomed Inform. 2015 Dec;58:37-48. doi: 10.1016/j.jbi.2015.09.007. Epub 2015 Sep 15.
8
Designing a Novel Approach Using a Greedy and Information-Theoretic Clustering-Based Algorithm for Anonymizing Microdata Sets.
Entropy (Basel). 2023 Dec 1;25(12):1613. doi: 10.3390/e25121613.
9
Proposal and Assessment of a De-Identification Strategy to Enhance Anonymity of the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) in a Public Cloud-Computing Environment: Anonymization of Medical Data Using Privacy Models.
J Med Internet Res. 2020 Nov 26;22(11):e19597. doi: 10.2196/19597.
10
Privacy-preserving data cube for electronic medical records: An experimental evaluation.
Int J Med Inform. 2017 Jan;97:33-42. doi: 10.1016/j.ijmedinf.2016.09.008. Epub 2016 Sep 24.

Cited by

1
Algorithms to anonymize structured medical and healthcare data: A systematic review.
Front Bioinform. 2022 Dec 22;2:984807. doi: 10.3389/fbinf.2022.984807. eCollection 2022.
2
Privacy-Preserving Anonymity for Periodical Releases of Spontaneous Adverse Drug Event Reporting Data: Algorithm Development and Validation.
JMIR Med Inform. 2021 Oct 28;9(10):e28752. doi: 10.2196/28752.
3
Improved privacy preserving method for periodical SRS publishing.
PLoS One. 2021 Apr 22;16(4):e0250457. doi: 10.1371/journal.pone.0250457. eCollection 2021.