A computational model to protect patient data from location-based re-identification.

Author

Malin Bradley

Affiliation

Department of Biomedical Informatics, Eskind Biomedical Library, Fourth Floor, 2209 Garland Avenue, Vanderbilt University, Nashville, TN 37232-8340, USA.

Publication information

Artif Intell Med. 2007 Jul;40(3):223-39. doi: 10.1016/j.artmed.2007.04.002. Epub 2007 Jun 1.

DOI: 10.1016/j.artmed.2007.04.002
PMID: 17544262

Abstract

OBJECTIVE

Health care organizations must preserve a patient's anonymity when disclosing personal data. Traditionally, patient identity has been protected by stripping identifiers from sensitive data such as DNA. However, simple automated methods can re-identify patient data using public information. In this paper, we present a solution to a threat to patient anonymity that arises when multiple health care organizations disclose data. In this setting, a patient's location visit pattern, or "trail", can be used to link seemingly anonymous DNA records to patient identities. This threat exists because health care organizations (1) cannot prevent the disclosure of certain types of patient information and (2) do not know how to systematically avoid trail re-identification. We develop and evaluate computational methods that health care organizations can apply to disclose patient-specific DNA records that are resistant to trail re-identification.

METHODS AND MATERIALS

To prevent trail re-identification, we introduce a formal model called k-unlinkability, which enables health care administrators to specify different degrees of patient anonymity. Specifically, k-unlinkability is satisfied when the trail of each DNA record is linkable to at least k identified records. We present several algorithms that enable health care organizations to coordinate their data disclosure, so that they can determine which DNA records can be shared without violating k-unlinkability. We evaluate the algorithms with the trails of patient populations derived from publicly available hospital discharge databases. Algorithm efficacy is evaluated using metrics based on real-world applications, including the number of suppressed records and the number of organizations that disclose records.
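
The k-unlinkability condition can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that a trail is the set of locations at which a record appears, and that a DNA trail is "linkable" to an identified trail when every location in the DNA trail also appears in the identified trail; the abstract does not spell out the linkage rule, so this subset test is an assumption. All names (`linkable`, `link_counts`, `violations`, `is_k_unlinkable`) and the toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List

# A trail is modeled as the set of locations (e.g. hospitals) where a record appears.
Trail = FrozenSet[str]


def linkable(dna_trail: Trail, id_trail: Trail) -> bool:
    # ASSUMPTION: a DNA trail can be linked to an identified trail when
    # the DNA trail's locations are a subset of the identified trail's.
    return dna_trail <= id_trail


def link_counts(dna_trails: Dict[str, Trail],
                id_trails: Dict[str, Trail]) -> Dict[str, int]:
    # For each DNA record, count the identified records it could be linked to.
    return {
        rec: sum(1 for t in id_trails.values() if linkable(trail, t))
        for rec, trail in dna_trails.items()
    }


def violations(dna_trails: Dict[str, Trail],
               id_trails: Dict[str, Trail], k: int) -> List[str]:
    # DNA records whose trail links to fewer than k identified records.
    counts = link_counts(dna_trails, id_trails)
    return sorted(rec for rec, c in counts.items() if c < k)


def is_k_unlinkable(dna_trails: Dict[str, Trail],
                    id_trails: Dict[str, Trail], k: int) -> bool:
    # k-unlinkability holds when no DNA record violates the threshold.
    return not violations(dna_trails, id_trails, k)


# Hypothetical toy data: four identified patients, three DNA records.
identified = {
    "alice": frozenset({"H1", "H2"}),
    "bob": frozenset({"H1", "H2"}),
    "carol": frozenset({"H1"}),
    "dave": frozenset({"H3"}),
}
dna = {
    "d1": frozenset({"H1", "H2"}),
    "d2": frozenset({"H1"}),
    "d3": frozenset({"H3"}),
}

counts = link_counts(dna, identified)
print(counts)
print(violations(dna, identified, 2))
```

In this toy data, record `d3` links to a single identified record, so the disclosure is 1-unlinkable but not 2-unlinkable.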

RESULTS

Our experiments indicate that it is unnecessary to suppress all patient records that initially violate k-unlinkability. Rather, only portions of the trails need to be suppressed. For example, if each hospital discloses 100% of its data on patients diagnosed with cystic fibrosis, then 48% of the DNA records are 5-unlinkable. A naïve solution would suppress the 52% of the DNA records that violate 5-unlinkability. However, by applying our protection algorithms, the hospitals can disclose 95% of the DNA records, all of which are 5-unlinkable. Similar findings hold for all populations studied.
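
The difference between suppressing whole records and suppressing only portions of trails can be sketched as follows. This is a simplified greedy illustration, not one of the paper's algorithms. It assumes the same subset-based linkage rule (a DNA trail links to an identified trail that contains all of its locations), and the function names and toy data are hypothetical.

```python
from typing import FrozenSet, List

Trail = FrozenSet[str]


def count_links(trail: Trail, id_trails: List[Trail]) -> int:
    # ASSUMPTION: subset-based linkage, as a stand-in for the paper's rule.
    return sum(1 for t in id_trails if trail <= t)


def partially_suppress(trail: Trail, id_trails: List[Trail], k: int) -> Trail:
    # Greedily withhold one location at a time until the remaining trail
    # links to at least k identified trails. An empty result means the
    # record ends up fully suppressed.
    current = set(trail)
    while current and count_links(frozenset(current), id_trails) < k:
        # Withhold the location whose removal yields the most links
        # (ties broken by location name).
        drop = max(
            sorted(current),
            key=lambda loc: count_links(frozenset(current - {loc}), id_trails),
        )
        current.remove(drop)
    return frozenset(current)


# Hypothetical toy data.
id_trails = [
    frozenset({"H1", "H2", "H3"}),
    frozenset({"H1", "H2"}),
    frozenset({"H1", "H2"}),
    frozenset({"H3"}),
]
dna = {
    "d1": frozenset({"H1", "H2", "H3"}),  # links to 1 identified trail
    "d2": frozenset({"H3"}),              # links to 2 identified trails
    "d3": frozenset({"H4"}),              # links to none
}

k = 2
# Naive approach: drop every record that violates k-unlinkability.
naive_kept = sorted(r for r, t in dna.items() if count_links(t, id_trails) >= k)
# Partial approach: shorten violating trails instead of dropping them outright.
partial = {r: partially_suppress(t, id_trails, k) for r, t in dna.items()}
partial_kept = sorted(r for r, t in partial.items() if t)

print(naive_kept)
print(partial_kept)
```

In this toy case the naive approach keeps only `d2`, while partial suppression also keeps `d1` by withholding it at `H3`. Record `d3` cannot be protected and is withheld everywhere.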

CONCLUSION

This research demonstrates that patient anonymity can be formally protected in shared databases. Our findings illustrate that significant quantities of patient-specific data can be disclosed with provable protection from trail re-identification. The configurability of our methods allows health care administrators to quantify the effects of different levels of privacy protection and formulate policy accordingly.
