Yilmaz Emre, Ji Tianxi, Ayday Erman, Li Pan
Case Western Reserve University.
Proc ACM Workshop Priv Electron Soc. 2020 Nov;2020:163-179. doi: 10.1145/3411497.3420214. Epub 2020 Nov 9.
Although genomic data is widely used in medical research and has significant impact there, sharing it endangers individuals' privacy, even when they share their genomic data anonymously or only partially. To address this problem, we present a framework, inspired by differential privacy, for sharing individuals' genomic data while preserving their privacy. We assume an individual who has some sensitive portion of her genome (e.g., mutations or single nucleotide polymorphisms, SNPs, that reveal sensitive information about the individual) that she does not want to share. The goals of the individual are to (i) preserve the privacy of her sensitive data (considering the correlations between the sensitive and non-sensitive parts), (ii) preserve the privacy of interdependent data (data that belongs to other individuals and is correlated with her data), and (iii) share as much non-sensitive data as possible to maximize the utility of data sharing. As opposed to traditional differential privacy-based data sharing schemes, the proposed scheme does not intentionally add noise to the data; it is based on selective sharing of data points. We observe that the traditional concept of differential privacy does not capture data sharing in such a setting, and hence we first introduce a privacy notion, ε-indirect privacy, that addresses data sharing in such settings. We show that the proposed framework does not reveal sensitive information to the attacker while providing high data sharing utility. We also compare the proposed technique with previous ones and show that it offers advantages in terms of both privacy and data sharing utility.