Valid inference for machine learning-assisted genome-wide association studies.

Affiliations

Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA.

Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA.

Publication information

Nat Genet. 2024 Nov;56(11):2361-2369. doi: 10.1038/s41588-024-01934-0. Epub 2024 Sep 30.

DOI: 10.1038/s41588-024-01934-0
PMID: 39349818
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11972620/

Abstract

Machine learning (ML) has become increasingly popular in almost all scientific disciplines, including human genetics. Owing to challenges related to sample collection and precise phenotyping, ML-assisted genome-wide association studies (GWAS), which use sophisticated ML techniques to impute phenotypes and then perform GWAS on the imputed outcomes, have become increasingly common in complex trait genetics research. However, the validity of ML-assisted GWAS associations has not been carefully evaluated. Here, we report pervasive risks for false-positive associations in ML-assisted GWAS and introduce Post-Prediction GWAS (POP-GWAS), a statistical framework that redesigns GWAS on ML-imputed outcomes. POP-GWAS ensures valid and powerful statistical inference irrespective of imputation quality and choice of algorithm, requiring only GWAS summary statistics as input. We employed POP-GWAS to perform a GWAS of bone mineral density derived from dual-energy X-ray absorptiometry imaging at 14 skeletal sites, identifying 89 new loci and revealing skeletal site-specific genetic architecture. Our framework offers a robust analytic solution for future ML-assisted GWAS.
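
The abstract does not spell out the estimator, so the following is only a minimal sketch of how a summary-statistics-based, prediction-powered combination could look for a single variant. It assumes three per-variant GWAS results: the observed phenotype in the labeled sample, the imputed phenotype in the same labeled sample, and the imputed phenotype in an independent unlabeled sample. The function name `combine_summary_stats`, its parameters, and the approximation of the covariance between the two labeled-sample estimates by `r * se_y_lab * se_yhat_lab` are all assumptions made for illustration; this is a generic control-variate construction, not necessarily the authors' exact method.

```python
import math


def combine_summary_stats(beta_y_lab, se_y_lab,
                          beta_yhat_lab, se_yhat_lab,
                          beta_yhat_unlab, se_yhat_unlab, r):
    """Hypothetical control-variate combination of three GWAS estimates.

    r is an assumed correlation between the two labeled-sample effect
    estimates (for example, approximated by the correlation between the
    observed and imputed phenotype in the labeled sample).
    """
    # Assumed covariance between the observed-phenotype and
    # imputed-phenotype estimates from the same labeled individuals.
    cov = r * se_y_lab * se_yhat_lab
    # Variance of (unlabeled imputed estimate - labeled imputed estimate);
    # the two samples are assumed independent.
    var_diff = se_yhat_lab ** 2 + se_yhat_unlab ** 2
    # Weight that minimizes the variance of the combined estimate.
    w = cov / var_diff
    beta = beta_y_lab + w * (beta_yhat_unlab - beta_yhat_lab)
    var = se_y_lab ** 2 + w ** 2 * var_diff - 2.0 * w * cov
    se = math.sqrt(var)
    z = beta / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


if __name__ == "__main__":
    # Made-up numbers for one variant, purely for illustration.
    print(combine_summary_stats(0.10, 0.05, 0.08, 0.05, 0.09, 0.01, r=0.8))
```

Under these assumptions, an uninformative imputation (`r = 0`) gives weight zero, so the result falls back to the labeled-sample GWAS of the observed phenotype. That fallback is one way such a combination can stay valid regardless of imputation quality, while a better imputation shrinks the standard error.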

