Data Sanitization to Reduce Private Information Leakage from Functional Genomics.

Author affiliations

Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.

Stanford University School of Medicine, Department of Genetics, Stanford, CA 94305, USA.

Publication information

Cell. 2020 Nov 12;183(4):905-917.e16. doi: 10.1016/j.cell.2020.09.036.

Abstract

The generation of functional genomics datasets is surging, because they provide insight into gene regulation and organismal phenotypes (e.g., genes upregulated in cancer). The intent behind functional genomics experiments is not necessarily to study genetic variants, yet they pose privacy concerns due to their use of next-generation sequencing. Moreover, there is a great incentive to broadly share raw reads for better statistical power and general research reproducibility. Thus, we need new modes of sharing beyond traditional controlled-access models. Here, we develop a data-sanitization procedure allowing raw functional genomics reads to be shared while minimizing privacy leakage, enabling principled privacy-utility trade-offs. Our protocol works with traditional Illumina-based assays and newer technologies such as 10x single-cell RNA sequencing. It involves quantifying the privacy leakage in reads by statistically linking study participants to known individuals. We carried out these linkages using data from highly accurate reference genomes and more realistic environmental samples.

Cited by

Private information leakage from single-cell count matrices.
Cell. 2024 Nov 14;187(23):6537-6549.e10. doi: 10.1016/j.cell.2024.09.012. Epub 2024 Oct 2.

Omics Approaches to Investigate the Pathogenesis of Suicide.
Biol Psychiatry. 2024 Dec 15;96(12):919-928. doi: 10.1016/j.biopsych.2024.05.017. Epub 2024 May 29.
