Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA.
Am J Hum Genet. 2020 Jun 4;106(6):805-817. doi: 10.1016/j.ajhg.2020.04.012. Epub 2020 May 21.
Despite strong transethnic genetic correlations reported in the literature for many complex traits, the non-transferability of polygenic risk scores across populations suggests the presence of population-specific components of genetic architecture. We propose an approach that models GWAS summary data for one trait in two populations to estimate genome-wide proportions of population-specific/shared causal SNPs. In simulations across various genetic architectures, we show that our approach yields approximately unbiased estimates with in-sample LD and a slight upward bias with out-of-sample LD. We analyze nine complex traits in individuals of East Asian and European ancestry, restricting to common SNPs (MAF > 5%), and find that most common causal SNPs are shared by both populations. Using the genome-wide estimates as priors in an empirical Bayes framework, we perform fine-mapping and observe that high-posterior SNPs (for both the population-specific and shared causal configurations) have highly correlated effects in East Asians and Europeans. In population-specific GWAS risk regions, we observe a 2.8× enrichment of shared high-posterior SNPs, suggesting that population-specific GWAS risk regions harbor shared causal SNPs that are undetected in the other GWASs due to differences in LD, allele frequencies, and/or sample size. Finally, we report enrichments of shared high-posterior SNPs in 53 tissue-specific functional categories and find evidence that SNP-heritability enrichments are driven largely by many low-effect common SNPs.
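As a rough illustration of how genome-wide proportions can act as priors in fine-mapping, the toy sketch below computes posterior probabilities for the four causal configurations of a single SNP (causal in neither population, in population 1 only, in population 2 only, or in both). It is not the authors' implementation: it ignores LD entirely, treats the two GWASs as independent, and assumes a z-score is N(0, 1) when the SNP is non-causal and N(0, 1 + n * var) when it is causal. The function names, the parameters `n1`, `n2`, `var1`, `var2`, and all numeric values are hypothetical.

```python
import math


def normal_logpdf(x, var):
    """Log density of a zero-mean normal with variance `var` at x."""
    return -0.5 * (math.log(2 * math.pi * var) + x * x / var)


def config_posteriors(z1, z2, priors, n1, n2, var1, var2):
    """Posterior over causal configurations for one SNP (toy model, no LD).

    z1, z2     : GWAS z-scores in populations 1 and 2
    priors     : dict of strictly positive prior probabilities for the keys
                 'neither', 'pop1_only', 'pop2_only', 'shared'
    n1, n2     : GWAS sample sizes (hypothetical inputs)
    var1, var2 : assumed per-SNP causal effect variances (hypothetical)
    """
    # Assumed z-score variance when the SNP is causal in each population.
    causal_var1 = 1.0 + n1 * var1
    causal_var2 = 1.0 + n2 * var2

    # Log-likelihood of (z1, z2) under each configuration, treating the
    # two populations as independent given the configuration.
    log_lik = {
        "neither": normal_logpdf(z1, 1.0) + normal_logpdf(z2, 1.0),
        "pop1_only": normal_logpdf(z1, causal_var1) + normal_logpdf(z2, 1.0),
        "pop2_only": normal_logpdf(z1, 1.0) + normal_logpdf(z2, causal_var2),
        "shared": normal_logpdf(z1, causal_var1) + normal_logpdf(z2, causal_var2),
    }

    # Bayes' rule in log space; subtract the max before exponentiating
    # for numerical stability, then normalize.
    log_post = {c: math.log(priors[c]) + log_lik[c] for c in log_lik}
    m = max(log_post.values())
    weights = {c: math.exp(v - m) for c, v in log_post.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical priors standing in for genome-wide proportion estimates.
    priors = {"neither": 0.97, "pop1_only": 0.01, "pop2_only": 0.01, "shared": 0.01}
    post = config_posteriors(6.0, 6.0, priors, 100_000, 100_000, 1e-4, 1e-4)
    for config, prob in post.items():
        print(f"{config}: {prob:.4f}")
```

In this toy setting, strong signals in both GWASs push the posterior toward the shared configuration, while a strong signal in only one GWAS favors the corresponding population-specific configuration. A realistic analysis would additionally have to account for LD among SNPs in a region.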