A data recipient centered de-identification method to retain statistical attributes.

Author information

Gal Tamas S, Tucker Thomas C, Gangopadhyay Aryya, Chen Zhiyuan

Affiliations

University of Kentucky, 2365 Harrodsburg Rd., Suite A230, Lexington, KY 40504, USA.

University of Maryland at Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA.

Publication information

J Biomed Inform. 2014 Aug;50:32-45. doi: 10.1016/j.jbi.2014.01.001. Epub 2014 Jan 10.

Abstract

Privacy has always been a great concern of patients and medical service providers. As a result of the recent advances in information technology and the government's push for the use of Electronic Health Record (EHR) systems, a large amount of medical data is collected and stored electronically. This data needs to be made available for analysis but at the same time patient privacy has to be protected through de-identification. Although biomedical researchers often describe their research plans when they request anonymized data, most existing anonymization methods do not use this information when de-identifying the data. As a result, the anonymized data may not be useful for the planned research project. This paper proposes a data recipient centered approach to tailor the de-identification method based on input from the recipient of the data. We demonstrate our approach through an anonymization project for biomedical researchers with specific goals to improve the utility of the anonymized data for statistical models used for their research project. The selected algorithm improves a privacy protection method called Condensation by Aggarwal et al. Our methods were tested and validated on real cancer surveillance data provided by the Kentucky Cancer Registry.
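
The abstract names Condensation but does not describe it. As background only, the following is a minimal sketch of the basic condensation idea as it is generally described: records are partitioned into groups of at least k, each group is summarized by its mean and covariance, and synthetic records are drawn to match those statistics. This is not the improved algorithm the paper proposes. The function name `condense`, the greedy nearest-neighbor grouping, and the uniform sampling along eigenvectors are assumptions made for illustration.

```python
import numpy as np


def condense(data, k, seed=0):
    """Illustrative condensation-style anonymization (hypothetical helper).

    Splits the records into groups of at least k, then replaces each group
    with synthetic records that match the group's mean and covariance in
    expectation.
    """
    rng = np.random.default_rng(seed)
    remaining = np.asarray(data, dtype=float)
    if remaining.ndim != 2 or len(remaining) < k or k < 2:
        raise ValueError("need a 2-D array with at least k >= 2 records")

    synthetic = []
    while len(remaining) > 0:
        if len(remaining) < 2 * k:
            # Too few records left for two full groups: use them all.
            group, remaining = remaining, remaining[:0]
        else:
            # Assumed grouping rule: random pivot plus its nearest records.
            pivot = remaining[rng.integers(len(remaining))]
            order = np.argsort(np.linalg.norm(remaining - pivot, axis=1))
            group, remaining = remaining[order[:k]], remaining[order[k:]]

        # First- and second-order statistics of the group.
        mean = group.mean(axis=0)
        cov = np.atleast_2d(np.cov(group, rowvar=False, bias=True))

        # Sample independently along each eigenvector. A uniform draw on
        # [-sqrt(3 * lam), sqrt(3 * lam)] has variance lam.
        eigvals, eigvecs = np.linalg.eigh(cov)
        half_width = np.sqrt(3.0 * np.clip(eigvals, 0.0, None))
        coords = rng.uniform(-1.0, 1.0, size=(len(group), len(eigvals)))
        synthetic.append(mean + (coords * half_width) @ eigvecs.T)

    return np.vstack(synthetic)
```

A larger k hides each individual among more records at the cost of coarser local statistics. How the paper tailors this trade-off to the recipient's statistical models is not stated in the abstract.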
