Center for Genetic Epidemiology and Genomics, School of Public Health, Soochow University, Jiangsu, People's Republic of China.
Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Soochow University, Jiangsu, People's Republic of China.
Hum Genet. 2016 Feb;135(2):171-84. doi: 10.1007/s00439-015-1621-y. Epub 2015 Dec 10.
Accurately estimating the distribution and heritability of SNP effects across the genome could help explain the mystery of missing heritability. In this study, we propose a novel statistical method for estimating the distribution and heritability of SNP effects from genome-wide association studies (GWASs), and compare its performance with that of several existing methods using both simulations and real data. Specifically, we study the full range of GWAS summary results and link observed p values to unobserved effect sizes through a (non-central) chi-square distribution. By modeling the full set of observed association signals with a multinomial distribution, we build a likelihood function of SNP effect sizes within parametric and non-parametric maximum likelihood frameworks. Simulation studies show that the proposed method can accurately estimate effect sizes and the number of associated SNPs. As real applications, we analyze publicly available GWAS summary results for height, body mass index (BMI), and bone mineral density (BMD). Our analyses show that over 10,000 SNPs might be associated with height, and that the total heritability attributable to these SNPs exceeds 70%. The heritabilities for BMI and BMD are ~10% and ~15%, respectively. The results indicate that the proposed method has the potential to improve the accuracy of heritability and effect size estimates for common SNPs in large-scale GWAS meta-analyses. These improved estimates may contribute to an enhanced understanding of the genetic basis of complex traits.
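The link between p values, non-central chi-square statistics, and a multinomial likelihood can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a 1-degree-of-freedom association test, a hypothetical set of p-value bin edges, and a simple finite mixture of non-centrality values standing in for the effect-size distribution. The function names `tail_prob` and `multinomial_loglik` are invented for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for 1-df chi-square tails


def tail_prob(p_threshold, ncp):
    """P(p value <= p_threshold) when the 1-df test statistic is
    non-central chi-square with non-centrality parameter ncp.

    Uses the fact that such a statistic is Z**2 with Z ~ N(sqrt(ncp), 1).
    """
    if p_threshold <= 0.0:
        return 0.0
    if p_threshold >= 1.0:
        return 1.0
    z_crit = -_STD_NORMAL.inv_cdf(p_threshold / 2.0)  # two-sided critical value
    delta = math.sqrt(ncp)
    return _STD_NORMAL.cdf(delta - z_crit) + _STD_NORMAL.cdf(-delta - z_crit)


def multinomial_loglik(counts, edges, components):
    """Multinomial log-likelihood (up to a constant) of binned p values.

    counts     -- number of SNPs observed in each p-value bin
    edges      -- ascending bin edges from 0 to 1 (len(counts) + 1 values)
    components -- list of (weight, ncp) pairs; weights are assumed to sum to 1
    """
    # Cumulative probability of the mixture at every bin edge.
    cum = [sum(w * tail_prob(e, ncp) for w, ncp in components) for e in edges]
    loglik = 0.0
    for n, lo, hi in zip(counts, cum, cum[1:]):
        if n > 0:
            loglik += n * math.log(hi - lo)
    return loglik


# Hypothetical bins and a hypothetical two-component effect-size model:
# 90% null SNPs (ncp = 0) and 10% associated SNPs (ncp = 10).
EDGES = [0.0, 5e-8, 1e-5, 1e-3, 0.05, 1.0]
TRUE_MODEL = [(0.9, 0.0), (0.1, 10.0)]
NULL_MODEL = [(1.0, 0.0)]
```

In this sketch, maximizing `multinomial_loglik` over the weights and non-centrality values would play the role of the maximum likelihood step described above. Converting a fitted non-centrality value into per-SNP variance explained requires the study sample size and is left out here.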