Efficient verification for outsourced genome-wide association studies.

Affiliations

Rutgers University, Newark, NJ, USA.

University of Texas Health Science Center at Houston, TX, USA.

Publication information

J Biomed Inform. 2021 May;117:103714. doi: 10.1016/j.jbi.2021.103714. Epub 2021 Mar 10.

Abstract

With cloud computing being widely adopted for conducting genome-wide association studies (GWAS), verifying the integrity of outsourced GWAS computation remains an open problem. Here, we propose two novel algorithms to generate synthetic SNPs that are indistinguishable from real SNPs. The first method creates synthetic SNPs based on the phenotype vector, while the second creates synthetic SNPs based on the real SNPs that are most similar to the phenotype vector. The time complexities of the first and second approaches are O(m) and O(m log n), respectively, where m is the number of subjects and n is the number of SNPs. Furthermore, through a game-theoretic analysis, we demonstrate that it is possible to incentivize honest behavior by the server by coupling appropriate payoffs with randomized verification. We conduct extensive experiments on our proposed methods, and the results show that, beyond a formal adversarial model, when only a few synthetic SNPs are generated and mixed into the real data, they cannot be distinguished from the real SNPs even by a variety of predictive machine learning models. We demonstrate that the proposed approach can ensure that logistic regression for GWAS can be outsourced in an efficient and trustworthy way.
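
The abstract does not spell out either algorithm or the payoff model, so the following is only a minimal sketch under stated assumptions, not the authors' method. It illustrates (a) how a synthetic SNP could be derived from the phenotype vector in a single O(m) pass, and (b) how an audit probability could be chosen so that honest computation pays at least as well as cheating. The function names, the `flip_rate` noise parameter, the 0/1/2 genotype coding, and the payoff formula are all hypothetical.

```python
import random


def synthetic_snp_from_phenotype(phenotype, flip_rate=0.1, seed=None):
    """Hypothetical generator: one pass over the m-subject phenotype vector.

    phenotype: list of 0 (control) / 1 (case) labels.
    Returns a genotype vector coded 0/1/2 (assumed minor-allele counts).
    """
    rng = random.Random(seed)
    snp = []
    for y in phenotype:
        # Start from a genotype that tracks the phenotype:
        # cases carry at least one risk allele, controls carry none.
        g = rng.choice([1, 2]) if y == 1 else 0
        # Assumed noise step: occasionally replace with a random genotype
        # so the planted SNP is not a perfect copy of the phenotype.
        if rng.random() < flip_rate:
            g = rng.choice([0, 1, 2])
        snp.append(g)
    return snp


def min_audit_probability(cost, payment, penalty):
    """Smallest audit probability q that makes honesty weakly preferable.

    Assumed payoffs (illustrative only):
      honest:   payment - cost
      cheating: (1 - q) * payment - q * penalty   (every audit catches a cheat)
    Solving honest >= cheating for q gives q >= cost / (payment + penalty).
    """
    return cost / (payment + penalty)
```

In this sketch the client would mix a few such planted SNPs into the real data before outsourcing and then check that the server's reported associations for them match what the client expects. With a computation cost of 2, a payment of 10, and a penalty of 30, auditing at least 5% of jobs would suffice under the assumed payoffs.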
