A game theoretic framework for analyzing re-identification risk.

Author information

Wan Zhiyu, Vorobeychik Yevgeniy, Xia Weiyi, Clayton Ellen Wright, Kantarcioglu Murat, Ganta Ranjit, Heatherly Raymond, Malin Bradley A

Affiliations

Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America.

Center for Biomedical Ethics and Society, Vanderbilt University, Tennessee, United States of America.

Publication information

PLoS One. 2015 Mar 25;10(3):e0120592. doi: 10.1371/journal.pone.0120592. eCollection 2015.

Abstract

Given the potential wealth of insights in personal data that big databases can provide, many organizations aim to share data while protecting privacy by releasing it in de-identified form. They remain concerned, however, because various demonstrations show that such data can be re-identified. Yet these investigations focus on how attacks can be perpetrated, not on the likelihood that they will be realized. This paper introduces a game theoretic framework that enables a publisher to balance re-identification risk against the value of sharing data, leveraging a natural assumption that a recipient only attempts re-identification if its potential gains outweigh the costs. We apply the framework to a real case study, where the value of the data to the publisher is the actual dollar amount of grant funding from a national sponsor and the re-identification gain of the recipient is the fine paid to a regulator for violation of federal privacy rules. There are three notable findings: 1) it is possible to achieve zero risk, in that the recipient never gains from re-identification, while sharing almost as much data as the optimal solution that allows for a small amount of risk; 2) the zero-risk solution enables sharing much more data than a commonly invoked de-identification policy of the U.S. Health Insurance Portability and Accountability Act (HIPAA); and 3) a sensitivity analysis demonstrates that these findings are robust to order-of-magnitude changes in player losses and gains. In combination, these findings support the view that such a framework can enable pragmatic policy decisions about de-identified data sharing.
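The core assumption, that a recipient attacks only when the expected gain exceeds the cost, can be illustrated with a toy leader-follower sketch. This is not the paper's model: the `Policy` class, the function names, and every number below are hypothetical, and the payoff rule (publisher value minus expected loss when an attack occurs) is a simplifying assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """A hypothetical sharing policy (all fields are illustrative assumptions)."""
    name: str
    value: float      # publisher's benefit from sharing under this policy
    p_success: float  # assumed probability that an attack re-identifies a record


def recipient_attacks(policy, gain, cost):
    # The recipient attacks only if the expected gain outweighs the attack cost.
    return policy.p_success * gain > cost


def publisher_payoff(policy, gain, cost, loss):
    # Assumed payoff: sharing value, minus expected loss if an attack occurs.
    if recipient_attacks(policy, gain, cost):
        return policy.value - policy.p_success * loss
    return policy.value


def best_policy(policies, gain, cost, loss, zero_risk_only=False):
    # The publisher moves first and anticipates the recipient's best response.
    candidates = [
        p for p in policies
        if not (zero_risk_only and recipient_attacks(p, gain, cost))
    ]
    return max(
        candidates,
        key=lambda p: publisher_payoff(p, gain, cost, loss),
        default=None,  # no candidate satisfies the constraint
    )


# Made-up numbers, chosen only to exercise the logic.
POLICIES = [
    Policy("raw", value=100.0, p_success=0.5),
    Policy("generalized", value=90.0, p_success=0.05),
    Policy("heavily_suppressed", value=40.0, p_success=0.001),
]
GAIN, COST, LOSS = 100.0, 10.0, 150.0

if __name__ == "__main__":
    for p in POLICIES:
        print(p.name, recipient_attacks(p, GAIN, COST),
              publisher_payoff(p, GAIN, COST, LOSS))
    print("optimal:", best_policy(POLICIES, GAIN, COST, LOSS).name)
    print("zero-risk:",
          best_policy(POLICIES, GAIN, COST, LOSS, zero_risk_only=True).name)
```

With these invented inputs, the unconstrained optimum and the best zero-risk policy coincide. That echoes the first finding qualitatively, but it says nothing about the paper's actual data or results.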

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7bc/4373733/89eddfffb0dc/pone.0120592.g001.jpg
