Identification of individuals by trait prediction using whole-genome sequencing data.

Affiliation

Human Longevity, Inc., Mountain View, CA 94303.

Publication information

Proc Natl Acad Sci U S A. 2017 Sep 19;114(38):10166-10171. doi: 10.1073/pnas.1711125114. Epub 2017 Sep 5.

DOI: 10.1073/pnas.1711125114
PMID: 28874526
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5617305/
Abstract

Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. Individually, for a large fraction of the traits, their predictive accuracy beyond ancestry and demographic information is limited. However, we have developed a maximum entropy algorithm that integrates multiple predictions to determine which genomic samples and phenotype measurements originate from the same person. Using this algorithm, we have reidentified an average of >8 of 10 held-out individuals in an ethnically mixed cohort and an average of 5 of either 10 African Americans or 10 Europeans. This work challenges current conceptions of personal privacy and may have far-reaching ethical and legal implications.
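
The matching task described in the abstract, deciding which phenotype record in a small pool belongs to a given genome, can be illustrated with a toy sketch. This is not the authors' maximum entropy algorithm. It swaps in a simpler stand-in: each trait prediction is assumed to carry independent Gaussian error with a known standard deviation, the per-trait log-likelihoods are summed, and each genome is assigned the best-scoring record. All function names, the simulated data, and the noise model are hypothetical.

```python
import math
import random


def log_likelihood(predicted, observed, sigmas):
    """Sum of independent Gaussian log-densities over traits (assumed error model)."""
    total = 0.0
    for p, o, s in zip(predicted, observed, sigmas):
        total += -0.5 * ((o - p) / s) ** 2 - math.log(s * math.sqrt(2 * math.pi))
    return total


def match_pool(predictions, observations, sigmas):
    """For each genome-derived prediction, return the index of the
    phenotype record with the highest combined log-likelihood."""
    matches = []
    for pred in predictions:
        scores = [log_likelihood(pred, obs, sigmas) for obs in observations]
        matches.append(max(range(len(scores)), key=scores.__getitem__))
    return matches


def simulate(pool_size=10, n_traits=5, noise=1.0, seed=0):
    """Simulate one held-out pool and count correctly matched individuals.

    The true traits are standard-normal draws and the predictions are the
    true traits plus Gaussian noise. Both are illustrative assumptions.
    """
    rng = random.Random(seed)
    truth = [[rng.gauss(0, 1) for _ in range(n_traits)] for _ in range(pool_size)]
    predictions = [[t + rng.gauss(0, noise) for t in row] for row in truth]
    sigmas = [noise] * n_traits
    matches = match_pool(predictions, truth, sigmas)
    return sum(1 for i, m in enumerate(matches) if m == i)


if __name__ == "__main__":
    # Compare how many of 10 individuals are matched as more traits are combined.
    for n_traits in (1, 5, 20):
        print(n_traits, simulate(pool_size=10, n_traits=n_traits))
```

Running `simulate` with different `n_traits` values gives a feel for the abstract's central point: traits that are weakly predictive on their own can become identifying once several are combined. The counts it prints come from simulated data only and say nothing about the accuracy reported in the paper.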

Similar articles

1. Identification of individuals by trait prediction using whole-genome sequencing data.
   Proc Natl Acad Sci U S A. 2017 Sep 19;114(38):10166-10171. doi: 10.1073/pnas.1711125114. Epub 2017 Sep 5.
2. The Costs of Anonymization: Case Study Using Clinical Data.
   J Med Internet Res. 2024 Apr 24;26:e49445. doi: 10.2196/49445.
3. Use and Understanding of Anonymization and De-Identification in the Biomedical Literature: Scoping Review.
   J Med Internet Res. 2019 May 31;21(5):e13484. doi: 10.2196/13484.
4. Towards broadening Forensic DNA Phenotyping beyond pigmentation: Improving the prediction of head hair shape from DNA.
   Forensic Sci Int Genet. 2018 Nov;37:241-251. doi: 10.1016/j.fsigen.2018.08.017. Epub 2018 Aug 29.
5. Feasibility of Reidentifying Individuals in Large National Physical Activity Data Sets From Which Protected Health Information Has Been Removed With Use of Machine Learning.
   JAMA Netw Open. 2018 Dec 7;1(8):e186040. doi: 10.1001/jamanetworkopen.2018.6040.
6. Comparative analysis of the GBLUP, emBayesB, and GWAS algorithms to predict genetic values in large yellow croaker (Larimichthys crocea).
   BMC Genomics. 2016 Jun 14;17:460. doi: 10.1186/s12864-016-2756-5.
7. Genomic evaluation of feed efficiency component traits in Duroc pigs using 80K, 650K and whole-genome sequence variants.
   Genet Sel Evol. 2018 Apr 6;50(1):14. doi: 10.1186/s12711-018-0387-9.
8. Genomic prediction of complex human traits: relatedness, trait architecture and predictive meta-models.
   Hum Mol Genet. 2015 Jul 15;24(14):4167-82. doi: 10.1093/hmg/ddv145. Epub 2015 Apr 26.
9. Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations.
   Genet Sel Evol. 2019 Dec 5;51(1):72. doi: 10.1186/s12711-019-0514-2.
10. Application of a Bayesian non-linear model hybrid scheme to sequence data for genomic prediction and QTL mapping.
    BMC Genomics. 2017 Aug 15;18(1):618. doi: 10.1186/s12864-017-4030-x.

Cited by

1. Regulating genome language models: navigating policy challenges at the intersection of AI and genetics.
   Hum Genet. 2025 Sep 16. doi: 10.1007/s00439-025-02768-4.
2. Genome-scale prediction of gene ontology from mass fingerprints reveals new metabolic gene functions.
   Life Sci Alliance. 2025 Sep 10;8(11). doi: 10.26508/lsa.202403154. Print 2025 Nov.
3. Combined genome-wide association study of facial traits in Europeans increases explained variance and improves prediction.
   Nat Commun. 2025 Jul 16;16(1):6562. doi: 10.1038/s41467-025-61761-7.
4. Forensic skeletal and molecular anthropology face to face: Combining expertise for identification of human remains.
   Ann N Y Acad Sci. 2025 Aug;1550(1):77-107. doi: 10.1111/nyas.15398. Epub 2025 Jul 10.
5. Siamese neural network-enhanced electrocardiography can re-identify anonymized healthcare data.
   Eur Heart J Digit Health. 2025 Feb 25;6(3):417-426. doi: 10.1093/ehjdh/ztaf011. eCollection 2025 May.
6. Forensic DNA phenotyping: a review on SNP panels, genotyping techniques, and prediction models.
   Forensic Sci Res. 2024 Mar 11;10(1):owae013. doi: 10.1093/fsr/owae013. eCollection 2025 Mar.
7. The effects of loss of Y chromosome on male health.
   Nat Rev Genet. 2025 May;26(5):320-335. doi: 10.1038/s41576-024-00805-y. Epub 2025 Jan 2.
8. Assessing Privacy Vulnerabilities in Genetic Data Sets: Scoping Review.
   JMIR Bioinform Biotechnol. 2024 May 27;5:e54332. doi: 10.2196/54332.
9. Astronaut omics and the impact of space on the human body at scale.
   Nat Commun. 2024 Jun 11;15(1):4952. doi: 10.1038/s41467-024-47237-0.
10. Future-proofing genomic data and consent management: a comprehensive review of technology innovations.
    Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giae021.
