A data recipient centered de-identification method to retain statistical attributes.

Authors

Gal Tamas S, Tucker Thomas C, Gangopadhyay Aryya, Chen Zhiyuan

Affiliations

University of Kentucky, 2365 Harrodsburg Rd., Suite A230, Lexington, KY 40504, USA.

University of Maryland at Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA.

Publication information

J Biomed Inform. 2014 Aug;50:32-45. doi: 10.1016/j.jbi.2014.01.001. Epub 2014 Jan 10.

DOI:10.1016/j.jbi.2014.01.001

PMID:24412834

Abstract

Privacy has always been a great concern of patients and medical service providers. As a result of the recent advances in information technology and the government's push for the use of Electronic Health Record (EHR) systems, a large amount of medical data is collected and stored electronically. This data needs to be made available for analysis but at the same time patient privacy has to be protected through de-identification. Although biomedical researchers often describe their research plans when they request anonymized data, most existing anonymization methods do not use this information when de-identifying the data. As a result, the anonymized data may not be useful for the planned research project. This paper proposes a data recipient centered approach to tailor the de-identification method based on input from the recipient of the data. We demonstrate our approach through an anonymization project for biomedical researchers with specific goals to improve the utility of the anonymized data for statistical models used for their research project. The selected algorithm improves a privacy protection method called Condensation by Aggarwal et al. Our methods were tested and validated on real cancer surveillance data provided by the Kentucky Cancer Registry.

Similar articles

A framework to preserve the privacy of electronic health data streams.

J Biomed Inform. 2014 Aug;50:95-106. doi: 10.1016/j.jbi.2014.03.015. Epub 2014 Apr 4.

Privacy-preserving data cube for electronic medical records: An experimental evaluation.

Int J Med Inform. 2017 Jan;97:33-42. doi: 10.1016/j.ijmedinf.2016.09.008. Epub 2016 Sep 24.

Text de-identification for privacy protection: a study of its impact on clinical text information content.

J Biomed Inform. 2014 Aug;50:142-50. doi: 10.1016/j.jbi.2014.01.011. Epub 2014 Feb 3.

Disassociation for electronic health record privacy.

J Biomed Inform. 2014 Aug;50:46-61. doi: 10.1016/j.jbi.2014.05.009. Epub 2014 May 28.

Utility-preserving anonymization for health data publishing.

BMC Med Inform Decis Mak. 2017 Jul 11;17(1):104. doi: 10.1186/s12911-017-0499-0.

δ-dependency for privacy-preserving XML data publishing.

J Biomed Inform. 2014 Aug;50:77-94. doi: 10.1016/j.jbi.2014.01.013. Epub 2014 Feb 8.

Protecting the privacy of individual general practice patient electronic records for geospatial epidemiology research.

Aust N Z J Public Health. 2014 Dec;38(6):548-52. doi: 10.1111/1753-6405.12262. Epub 2014 Oct 12.

A flexible approach to distributed data anonymization.

J Biomed Inform. 2014 Aug;50:62-76. doi: 10.1016/j.jbi.2013.12.002. Epub 2013 Dec 12.

Privacy preserving data anonymization of spontaneous ADE reporting system dataset.

BMC Med Inform Decis Mak. 2016 Jul 18;16 Suppl 1(Suppl 1):58. doi: 10.1186/s12911-016-0293-4.

Cited by

Algorithms to anonymize structured medical and healthcare data: A systematic review.

Front Bioinform. 2022 Dec 22;2:984807. doi: 10.3389/fbinf.2022.984807. eCollection 2022.

Cancer registries and data protection in the age of health digital interoperability in Europe: The perspective of the Italian Network of Cancer Registries (AIRTUM).

Front Oncol. 2022 Dec 6;12:1052057. doi: 10.3389/fonc.2022.1052057. eCollection 2022.

Access to Routinely Collected Clinical Data for Research: A Process Implemented at an Academic Medical Center.

Clin Transl Sci. 2019 May;12(3):231-235. doi: 10.1111/cts.12614. Epub 2019 Feb 12.

Ethics and Epistemology in Big Data Research.

J Bioeth Inq. 2017 Dec;14(4):489-500. doi: 10.1007/s11673-017-9771-3. Epub 2017 Mar 20.
