Privacy-Preserving Fingerprinting Against Collusion and Correlation Threats in Genomic Data.

Authors

Ji Tianxi, Ayday Erman, Yilmaz Emre, Li Pan

Affiliations

Texas Tech University.

Case Western Reserve University.

Publication

Proc Priv Enhanc Technol. 2024;2024(3):659-673. doi: 10.56553/popets-2024-0098.

Abstract

Sharing genomic databases is critical to collaborative research in computational biology. A shared database is more informative than specific genome-wide association study (GWAS) statistics because it enables "do-it-yourself" calculations. Genomic databases embody the curator's intellectual effort and contain participants' sensitive information; thus, when sharing data, the curator (database owner) should be able to prevent unauthorized redistribution and protect individuals' genomic data privacy. As it becomes increasingly common for a single database to be shared with multiple recipients, the shared genomic database should also be robust against collusion attacks, in which multiple malicious recipients combine their individual copies to forge a pirated one in the hope that none of them can be traced. The strong correlation among genomic entries also makes the shared database vulnerable to attacks that leverage public correlation models. In this paper, we assess the robustness of shared genomic databases under both collusion and correlation threats. To this end, we first develop a novel genomic database fingerprinting scheme, called Gen-Scope. It achieves both copyright protection (by enabling traceability) and privacy preservation (via local differential privacy) for shared genomic databases. To defend against collusion attacks, we augment Gen-Scope with a powerful traitor-tracing technique, namely Tardos codes. Via experiments on a real-world genomic database, we show that Gen-Scope achieves strong fingerprint robustness: for example, the fingerprint cannot be compromised even if the attacker changes 45% of the entries in its received fingerprinted copy, and colluders are detected with high probability. Additionally, Gen-Scope outperforms the considered baseline methods. Under the same privacy and copyright guarantees, the accuracy of the fingerprinted genomic database obtained by Gen-Scope is around 10% higher than that achieved by the baseline, and, in terms of preserving GWAS statistics, the consistency of variant-phenotype associations can be about 20% higher. Notably, we also show empirically that Gen-Scope can identify at least one of the colluders even if malicious recipients collude after independent correlation attacks.
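The abstract names Tardos codes as the traitor-tracing component but does not say how they are embedded in Gen-Scope. As a rough illustration only, the sketch below shows the textbook Tardos construction with a symmetric accusation score on plain bit strings; it is not the authors' scheme. The function names, the majority-vote collusion strategy, and the parameter values (code length, number of users) are all assumptions chosen for the example.

```python
import math
import random


def tardos_biases(m, c, rng):
    """Draw one bias p_i per code position from the arcsine
    distribution, cut off at t = 1/(300c) as in Tardos's construction."""
    t = 1.0 / (300 * c)
    t_prime = math.asin(math.sqrt(t))
    return [math.sin(rng.uniform(t_prime, math.pi / 2 - t_prime)) ** 2
            for _ in range(m)]


def tardos_codeword(biases, rng):
    """One recipient's fingerprint: bit i is 1 with probability p_i."""
    return [1 if rng.random() < p else 0 for p in biases]


def majority_collusion(codewords):
    """Hypothetical attack: colluders take a per-position majority vote.
    Positions where all colluders agree are left unchanged."""
    c = len(codewords)
    return [1 if 2 * sum(bits) > c else 0 for bits in zip(*codewords)]


def accusation_score(codeword, pirated, biases):
    """Symmetric accusation score: matching an unlikely pirated bit
    raises suspicion, mismatching lowers it."""
    score = 0.0
    for x, y, p in zip(codeword, pirated, biases):
        q = p if y == 1 else 1.0 - p  # chance an innocent bit equals y
        if x == y:
            score += math.sqrt((1.0 - q) / q)
        else:
            score -= math.sqrt(q / (1.0 - q))
    return score


def demo(seed=0, n_users=23, m=3000, c=3):
    """Assumed toy setup: the first c users collude, the rest are innocent."""
    rng = random.Random(seed)
    biases = tardos_biases(m, c, rng)
    codes = [tardos_codeword(biases, rng) for _ in range(n_users)]
    pirated = majority_collusion(codes[:c])
    scores = [accusation_score(code, pirated, biases) for code in codes]
    return scores[:c], scores[c:]
```

Calling `demo()` returns the colluders' scores and the innocent users' scores separately. In this toy setting an innocent user's score has mean zero, while each colluder's score grows linearly with the code length, which is why the curator can accuse anyone whose score exceeds a threshold.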
