Quantitative Life Sciences, McGill University, Montreal, Quebec, Canada.
Department of Human Genetics, McGill University, Montreal, Quebec, Canada.
PLoS Genet. 2023 Dec 28;19(12):e1011104. doi: 10.1371/journal.pgen.1011104. eCollection 2023 Dec.
Identifying causal variants from genome-wide association studies (GWAS) is challenging because of widespread linkage disequilibrium (LD) and the possible existence of multiple causal variants in the same genomic locus. Functional annotations of the genome may help to prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. Classical fine-mapping methods that exhaustively search variant-level causal configurations have a high computational cost, especially when the underlying genetic architecture and LD patterns are complex. SuSiE introduced an iterative Bayesian stepwise selection algorithm for efficient fine-mapping. In this work, we build connections between SuSiE and a paired mean field variational inference algorithm through the implementation of a sparse projection, and propose effective strategies for estimating hyperparameters and summarizing posterior probabilities. Moreover, we incorporate functional annotations into fine-mapping by jointly estimating enrichment weights to derive functionally informed priors. We evaluate the performance of SparsePro through extensive simulations using resources from the UK Biobank. Compared with state-of-the-art methods, SparsePro achieved improved power for fine-mapping with reduced computation time. We demonstrate the utility of SparsePro through fine-mapping of five functional biomarkers of clinically relevant phenotypes. In summary, we have developed an efficient fine-mapping method for integrating summary statistics and functional annotations. Our method can have wide utility in understanding the genetics of complex traits and increasing the yield of functional follow-up studies of GWAS. SparsePro software is available on GitHub at https://github.com/zhwm/SparsePro.
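To make the idea of functionally informed priors concrete, the following is a minimal sketch of how annotation enrichment weights could be turned into per-variant prior causal probabilities. It assumes a softmax over annotation scores within a locus. The function name `functional_priors`, the toy annotations, and the weight values are hypothetical and are not taken from the SparsePro software, which may differ in its details.

```python
import math


def functional_priors(annotations, weights):
    """Map per-variant annotations to prior causal probabilities.

    annotations: one row per variant, each row a list of 0/1 annotation flags.
    weights: one log-enrichment weight per annotation (hypothetical values).
    Assumption: priors are a softmax of the annotation scores within a locus.
    """
    # Score each variant as the weighted sum of its annotations.
    scores = [sum(a * w for a, w in zip(row, weights)) for row in annotations]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    # Normalize so the priors sum to one across variants in the locus.
    return [e / total for e in exps]


# Toy locus: 4 variants and 2 binary annotations (illustrative only).
annotations = [[1, 0], [0, 0], [0, 1], [0, 0]]
# Hypothetical enrichment: 4-fold for annotation 1, 2-fold for annotation 2.
weights = [math.log(4.0), math.log(2.0)]
priors = functional_priors(annotations, weights)
```

With all weights set to zero, this sketch reduces to a uniform prior over the variants in the locus, which corresponds to fine-mapping without annotation information.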