Evaluating multi-ancestry genome-wide association methods: Statistical power, population structure, and practical implications.

Author information

Dias Julie-Alexia, Chen Tony, Xing Hua, Wang Xiaoyu, Rodriguez Alex A, Madduri Ravi K, Kraft Peter, Zhang Haoyu

Affiliations

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Division of Cancer Epidemiology & Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA; Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., Rockville, MD, USA.

Publication information

Am J Hum Genet. 2025 Aug 28. doi: 10.1016/j.ajhg.2025.08.006.

Abstract

The increasing availability of diverse biobanks has enabled multi-ancestry genome-wide association studies (GWASs) to enhance the discovery of genetic variants across traits and diseases. However, the choice of an optimal method remains debated, due to challenges in statistical power differences across ancestral groups and approaches to account for population structure. Two primary strategies exist: (1) pooled analysis, which combines individuals from all genetic backgrounds into a single dataset while adjusting for population stratification using principal components, increasing the sample size and statistical power but requiring careful control of population stratification; and (2) meta-analysis, which performs ancestry-group-specific GWASs and subsequently combines summary statistics, potentially capturing fine-scale population structure but facing limitations in handling admixed individuals. Using large-scale simulations with varying sample sizes and ancestry compositions, we compare these methods alongside real data analyses of eight continuous and five binary traits from the UK Biobank (N ≈ 324,000) and the All of Us Research Program (N ≈ 207,000). Our results demonstrate that pooled analysis generally exhibits better statistical power while effectively adjusting for population stratification. We further present a theoretical framework linking power differences to allele-frequency variations across populations. These findings, validated across both biobanks, highlight pooled analysis as a powerful and scalable strategy for multi-ancestry GWASs, improving genetic discovery while maintaining rigorous population structure control.
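
The two strategies described above can be illustrated with a toy single-variant simulation. This is a minimal sketch and not the authors' pipeline. It assumes two discrete ancestry groups, uses a group indicator as a stand-in for principal components, and combines group-specific estimates with a fixed-effect inverse-variance-weighted meta-analysis. All function names (`simulate`, `pooled`, `meta`, `ols_last_coef`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def ols_last_coef(y, X):
    """Ordinary least squares; return (estimate, SE) for the last column of X."""
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ (X.T @ y)
    resid = y - X @ coef
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return float(coef[-1]), float(se[-1])


def simulate(n_per_group=(4000, 1000), freqs=(0.10, 0.40),
             beta=0.05, mean_shift=0.5, seed=0):
    """Simulate one variant in two groups that differ in allele frequency
    and in phenotype mean (the source of stratification). Values are illustrative."""
    rng = np.random.default_rng(seed)
    g_parts, y_parts, grp_parts = [], [], []
    for k, (n, f) in enumerate(zip(n_per_group, freqs)):
        geno = rng.binomial(2, f, size=n).astype(float)
        pheno = beta * geno + mean_shift * k + rng.normal(size=n)
        g_parts.append(geno)
        y_parts.append(pheno)
        grp_parts.append(np.full(n, float(k)))
    return (np.concatenate(g_parts), np.concatenate(y_parts),
            np.concatenate(grp_parts))


def pooled(g, y, grp, adjust=True):
    """Pooled analysis: one regression over everyone.
    With adjust=True the group indicator is a covariate (stand-in for PCs)."""
    cols = [np.ones_like(g)]
    if adjust:
        cols.append(grp)
    cols.append(g)
    return ols_last_coef(y, np.column_stack(cols))


def meta(g, y, grp):
    """Meta-analysis: per-group regressions combined by inverse-variance weights."""
    betas, ses = [], []
    for k in np.unique(grp):
        mask = grp == k
        X = np.column_stack([np.ones(int(mask.sum())), g[mask]])
        b, s = ols_last_coef(y[mask], X)
        betas.append(b)
        ses.append(s)
    betas, ses = np.array(betas), np.array(ses)
    w = 1.0 / ses**2
    return float(np.sum(w * betas) / np.sum(w)), float(np.sqrt(1.0 / np.sum(w)))


g, y, grp = simulate()
print("pooled, unadjusted:", pooled(g, y, grp, adjust=False))
print("pooled, adjusted:  ", pooled(g, y, grp))
print("meta-analysis:     ", meta(g, y, grp))
```

In this simplified setting the adjusted pooled estimate and the meta-analysis estimate are expected to be nearly identical, while the unadjusted pooled estimate is confounded by the group mean difference. The sketch therefore does not reproduce the power difference reported in the abstract, which concerns richer settings such as continuous population structure and admixed individuals.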
