Privacy-preserving and robust watermarking on sequential genome data using belief propagation and local differential privacy.

Authors

Öksüz Abdullah Çağlar, Ayday Erman, Güdükbay Uğur

Affiliations

Department of Computer Engineering, Bilkent University, Ankara, Turkey.

Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA.

Publication information

Bioinformatics. 2021 Sep 9;37(17):2668-2674. doi: 10.1093/bioinformatics/btab128.

Abstract

MOTIVATION

Genome data has been a subject of study for both biology and computer science since the start of the Human Genome Project in 1990. Since then, genome sequencing for medical and social purposes has become increasingly available and affordable. Genome data can be shared on public websites or with service providers (SPs). However, this sharing compromises the privacy of donors, even under partial sharing conditions. We mainly focus on the liability issues arising from the unauthorized sharing of genome data. Watermarking is one technique for addressing liability issues in data sharing.

RESULTS

To detect malicious correspondents and SPs, whose aim is to share genome data without individuals' consent and without being detected, we propose a novel watermarking method for sequential genome data using the belief propagation algorithm. Our method must satisfy two criteria: (i) embedding robust watermarks, so that malicious adversaries cannot tamper with the watermark by modifying the data and are identified with high probability; and (ii) achieving ϵ-local differential privacy in all data sharing with SPs. To preserve system robustness against single-SP and collusion attacks, we consider publicly available genomic information such as minor allele frequency, linkage disequilibrium, phenotype information and familial information. Our proposed scheme achieves a 100% detection rate against single-SP attacks with a watermark length of only 3%. In the worst-case scenario of collusion attacks (50% of SPs are malicious), 80% detection is achieved with a 5% watermark length and 90% detection with a 10% watermark length. In all cases, the impact of ϵ on precision remained negligible and high privacy is ensured.
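As a rough illustration of what ϵ-local differential privacy means for a single genomic position, the sketch below applies k-ary randomized response, a standard textbook mechanism. It is not taken from the paper and may differ from the authors' scheme. The encoding of each SNP as 0, 1 or 2 and all function names are assumptions made for this example only.

```python
import math
import random

# Assumed encoding (not from the paper): each SNP is stored as 0, 1 or 2.
SNP_STATES = (0, 1, 2)


def keep_probability(epsilon: float, k: int = len(SNP_STATES)) -> float:
    """Probability of reporting the true state under k-ary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def randomized_response(value: int, epsilon: float, rng: random.Random) -> int:
    """Report the true state with probability p, otherwise a uniformly
    chosen different state. The ratio between the probabilities of any
    output under two different inputs is at most e^epsilon."""
    if value not in SNP_STATES:
        raise ValueError(f"unexpected SNP state: {value}")
    if rng.random() < keep_probability(epsilon):
        return value
    others = [s for s in SNP_STATES if s != value]
    return rng.choice(others)


def perturb_sequence(snps, epsilon: float, seed: int = 0):
    """Apply the mechanism independently to every position of a sequence."""
    rng = random.Random(seed)
    return [randomized_response(v, epsilon, rng) for v in snps]


if __name__ == "__main__":
    original = [0, 1, 2, 2, 0, 1, 0, 0]
    print(perturb_sequence(original, epsilon=1.0))
```

Smaller values of ϵ push the keep probability toward 1/3 (uniform noise over the three states), while larger values leave the data almost unchanged. This is the privacy/utility trade-off that the ϵ parameter controls.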

AVAILABILITY AND IMPLEMENTATION

https://github.com/acoksuz/PPRW_SGD_BPLDP

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

