Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.
Institute for Advanced Research, Wenzhou Medical University, Wenzhou, Zhejiang, China.
Nat Genet. 2019 Dec;51(12):1749-1755. doi: 10.1038/s41588-019-0530-8. Epub 2019 Nov 25.
The genome-wide association study (GWAS) has been widely used as an experimental design to detect associations between genetic variants and a phenotype. Two major confounding factors, population stratification and relatedness, can inflate GWAS test statistics and hence lead to spurious associations. Mixed linear model (MLM)-based approaches can be used to account for sample structure. However, genome-wide association (GWA) analyses in biobank samples such as the UK Biobank (UKB) often exceed the capability of most existing MLM-based tools, especially if the number of traits is large. Here, we develop an MLM-based tool (fastGWA) for GWA analyses of biobank-scale data that controls for population stratification by principal components and for relatedness by a sparse genetic relationship matrix. We demonstrate by extensive simulations that fastGWA is reliable, robust and highly resource-efficient. We then apply fastGWA to 2,173 traits on array-genotyped and imputed samples from 456,422 individuals and to 2,048 traits on whole-exome-sequenced samples from 46,191 individuals in the UKB.
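The sparse genetic relationship matrix mentioned in the abstract can be illustrated with a minimal sketch. This is not the fastGWA implementation: the function name `sparse_grm`, the 0.05 cutoff, and the dense intermediate matrix are all assumptions chosen for illustration, and a dense intermediate would not scale to biobank-sized samples. The idea shown is only that relatedness estimates below a cutoff are set to zero, so that the matrix stores entries for close relatives and the diagonal alone.

```python
import numpy as np
from scipy import sparse


def sparse_grm(genotypes, threshold=0.05):
    """Build a thresholded (sparse) genetic relationship matrix.

    genotypes: (n_individuals, n_variants) array of allele counts (0/1/2).
    threshold: hypothetical cutoff; off-diagonal entries below it become 0.
    """
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0            # sample allele frequencies
    keep = (freq > 0.0) & (freq < 1.0)     # drop monomorphic variants
    g, freq = g[:, keep], freq[keep]

    # Standardize each variant, then average the cross-products over variants.
    z = (g - 2.0 * freq) / np.sqrt(2.0 * freq * (1.0 - freq))
    grm = z @ z.T / z.shape[1]
    grm = (grm + grm.T) / 2.0              # enforce exact symmetry

    # Zero out off-diagonal entries below the cutoff; keep the diagonal.
    off_diag = ~np.eye(grm.shape[0], dtype=bool)
    grm[off_diag & (grm < threshold)] = 0.0
    return sparse.csr_matrix(grm)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    geno = rng.binomial(2, 0.3, size=(20, 3000)).astype(float)
    geno[1] = geno[0]                      # make individuals 0 and 1 identical
    s = sparse_grm(geno)
    print("stored entries:", s.nnz, "of", s.shape[0] * s.shape[1])
```

In an MLM of the kind the abstract describes, a matrix like this would enter the variance structure of the phenotype, while principal components would be included as fixed-effect covariates; both of those steps are left out of this sketch.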