
Quantification of private information leakage from phenotype-genotype data: linking attacks.

Authors

Harmanci Arif, Gerstein Mark

Affiliations

Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, USA.

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA.

Publication information

Nat Methods. 2016 Mar;13(3):251-6. doi: 10.1038/nmeth.3746. Epub 2016 Feb 1.

DOI: 10.1038/nmeth.3746
PMID: 26828419
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4834871/
Abstract

Studies on genomic privacy have traditionally focused on identifying individuals using DNA variants. In contrast, molecular phenotype data, such as gene expression levels, are generally assumed to be free of such identifying information. Although there is no explicit genotypic information in phenotype data, adversaries can statistically link phenotypes to genotypes using publicly available genotype-phenotype correlations such as expression quantitative trait loci (eQTLs). This linking can be accurate when high-dimensional data (i.e., many expression levels) are used, and the resulting links can then reveal sensitive information (for example, the fact that an individual has cancer). Here we develop frameworks for quantifying the leakage of characterizing information from phenotype data sets. These frameworks can be used to estimate the leakage from large data sets before release. We also present a general three-step procedure for practically instantiating linking attacks and a specific attack using outlier gene expression levels that is simple yet accurate. Finally, we describe the effectiveness of this outlier attack under different scenarios.
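
To make the idea of a linking attack concrete, here is a toy sketch in Python on purely synthetic data. It is not the authors' implementation. The abstract does not list the three steps, so the decomposition used here (choose eQTLs, predict genotypes from expression, match the predictions against a genotype data set) is an assumption, as are the simulated effect sizes, the rank-based extremity score, the 0.3 cutoff, and every function name.

```python
import random


def simulate(n_ind=40, n_eqtl=150, noise=0.6, seed=1):
    """Hypothetical data: genotypes coded 0/1/2 and one expression level per eQTL.

    Each eQTL has an assumed known correlation sign; expression shifts by one
    unit per allele in that direction, plus Gaussian noise.
    """
    rng = random.Random(seed)
    signs = [rng.choice((-1, 1)) for _ in range(n_eqtl)]
    geno = [[rng.choice((0, 1, 2)) for _ in range(n_eqtl)] for _ in range(n_ind)]
    expr = [
        [signs[j] * (row[j] - 1) + rng.gauss(0, noise) for j in range(n_eqtl)]
        for row in geno
    ]
    return signs, geno, expr


def extremity(expr):
    """Rank-based score in (-0.5, 0.5) for each individual and gene.

    Values near +/-0.5 mark outlier expression levels within the sample.
    """
    n, n_eqtl = len(expr), len(expr[0])
    ext = [[0.0] * n_eqtl for _ in range(n)]
    for j in range(n_eqtl):
        order = sorted(range(n), key=lambda i: expr[i][j])
        for rank, i in enumerate(order):
            ext[i][j] = (rank + 0.5) / n - 0.5
    return ext


def predict(ext_row, signs, cutoff=0.3):
    """Call a homozygous genotype (0 or 2) only where expression is extreme.

    Non-extreme positions are left uncalled (None) and ignored when linking.
    """
    preds = []
    for e, s in zip(ext_row, signs):
        if abs(e) < cutoff:
            preds.append(None)
        else:
            # High expression with a positive sign, or low expression with a
            # negative sign, both point to genotype 2; otherwise genotype 0.
            preds.append(2 if (e > 0) == (s > 0) else 0)
    return preds


def link(preds, geno):
    """Return the index of the genotype record closest to the predictions."""
    best, best_dist = None, float("inf")
    for k, row in enumerate(geno):
        dist = sum(abs(p - g) for p, g in zip(preds, row) if p is not None)
        if dist < best_dist:
            best, best_dist = k, dist
    return best


def run_attack():
    """Fraction of expression profiles linked to the correct genotype record."""
    signs, geno, expr = simulate()
    ext = extremity(expr)
    hits = sum(link(predict(ext[i], signs), geno) == i for i in range(len(geno)))
    return hits / len(geno)


if __name__ == "__main__":
    print(f"linking accuracy on synthetic data: {run_attack():.2f}")
```

In this toy setting the genotype calls at individual eQTLs are noisy, but distances summed over many eQTLs separate the correct record from the rest, which mirrors the abstract's remark that linking can be accurate when high-dimensional data are used. Real expression data have much weaker and more heterogeneous eQTL effects, so the accuracy seen here says nothing about real data sets.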

