An Experimental Comparison of Quality Models for Health Data De-Identification.

Authors

Eicher Johanna, Kuhn Klaus A, Prasser Fabian

Affiliation

Institute of Medical Statistics and Epidemiology, University Hospital rechts der Isar, Technical University of Munich, Germany.

Publication

Stud Health Technol Inform. 2017;245:704-708.

Abstract

When individual-level health data are shared in biomedical research, the privacy of patients must be protected. This is typically achieved by data de-identification methods, which transform data in such a way that formal privacy requirements are met. In the process, it is important to minimize the loss of information to maintain data quality. Although several models have been proposed for measuring this aspect, it remains unclear which model is best suited for which application. We have therefore performed an extensive experimental comparison. We first implemented several common quality models in the ARX de-identification tool for biomedical data. We then used each model to de-identify a patient discharge dataset covering almost 4 million cases and analyzed the outputs to measure the impact of different quality models on real-world applications. Our results show that different models are best suited for specific applications, but that one model (Non-Uniform Entropy) is particularly well suited for general-purpose use.
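
The abstract names Non-Uniform Entropy without defining it. As a rough illustration only, the sketch below uses the formulation common in the de-identification literature: each transformed cell contributes the negative base-2 logarithm of the share of its original value among all records that end up with the same generalized value. The function name `non_uniform_entropy_loss` and the toy age column are assumptions made for this example; this is not the ARX implementation and may differ from the exact variant evaluated in the paper.

```python
from collections import Counter
from math import log2


def non_uniform_entropy_loss(original, generalized):
    """Information loss of one generalized column (illustrative sketch).

    Each cell contributes -log2(P(original value | generalized value)),
    with the probability estimated from frequencies in the column itself.
    """
    if len(original) != len(generalized):
        raise ValueError("columns must have the same length")
    # How often each (original, generalized) pair occurs.
    pair_counts = Counter(zip(original, generalized))
    # How many records share each generalized value.
    group_counts = Counter(generalized)
    return sum(
        -log2(pair_counts[(o, g)] / group_counts[g])
        for o, g in zip(original, generalized)
    )


# Hypothetical toy column: ages generalized to ten-year ranges.
ages = [34, 36, 36, 52]
age_ranges = ["30-39", "30-39", "30-39", "50-59"]
loss = non_uniform_entropy_loss(ages, age_ranges)
```

Under this formulation an unchanged column has zero loss, and the loss grows as rarer original values are merged into larger groups, so the measure is sensitive to the distribution of values rather than only to the size of the generalization.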
