Center for Multimodal Imaging and Genetics, University of California at San Diego, La Jolla, CA 92037, USA.
NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo 0424, Norway.
Genetics. 2021 Mar 31;217(3). doi: 10.1093/genetics/iyaa046.
We propose an extended Gaussian mixture model for the distribution of causal effects of common single nucleotide polymorphisms (SNPs) for human complex phenotypes that depends on linkage disequilibrium (LD) and heterozygosity (H), while also allowing for independent components for small and large effects. Using a precise methodology showing how genome-wide association study (GWAS) summary statistics (z-scores) arise through LD with underlying causal SNPs, we applied the model to GWASs of multiple human phenotypes. Our findings indicated that causal effects are distributed with dependence on total LD and H, whereby SNPs with lower total LD and H are more likely to be causal with larger effects; this dependence is consistent with models of the influence of negative pressure from natural selection. Compared with the basic Gaussian mixture model it is built on, the extended model (primarily through quantification of selection pressure) reproduces with greater accuracy the empirical distributions of z-scores, thus providing better estimates of genetic quantities, such as polygenicity and heritability, that arise from the distribution of causal effects.
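As a rough illustration of the basic setup the abstract builds on, the following Python sketch draws causal effects from a two-component mixture (zero with probability 1 - pi1, Gaussian otherwise) and propagates them to z-scores through LD. It is a toy under stated assumptions, not the authors' implementation: the function names, the AR(1)-style LD matrix, the scaling z_i = sqrt(N) * sum_j r_ij * sqrt(H_j) * beta_j + noise_i, and the independent unit-variance noise are all illustrative choices, and the LD- and H-dependence of the extended model is not reproduced here.

```python
import math
import random


def ar1_ld(n_snps, rho):
    """Toy LD matrix (assumption): correlation decays as rho**|i - j|."""
    return [[rho ** abs(i - j) for j in range(n_snps)] for i in range(n_snps)]


def draw_causal_effects(n_snps, pi1, sigma_beta, rng):
    """Basic mixture: beta = 0 with probability 1 - pi1,
    otherwise beta ~ N(0, sigma_beta**2)."""
    return [rng.gauss(0.0, sigma_beta) if rng.random() < pi1 else 0.0
            for _ in range(n_snps)]


def z_scores(ld_r, het, beta, n_samples, rng):
    """Assumed form: z_i = sqrt(N) * sum_j r_ij * sqrt(H_j) * beta_j + noise_i.
    Noise is drawn independently as N(0, 1) per SNP, a simplification."""
    scaled = [math.sqrt(h) * b for h, b in zip(het, beta)]
    root_n = math.sqrt(n_samples)
    return [root_n * sum(r * s for r, s in zip(row, scaled)) + rng.gauss(0.0, 1.0)
            for row in ld_r]


# Hypothetical parameter values, chosen only to make the sketch run.
rng = random.Random(0)
ld = ar1_ld(n_snps=200, rho=0.9)
het = [0.3] * 200
beta = draw_causal_effects(200, pi1=0.02, sigma_beta=0.01, rng=rng)
z = z_scores(ld, het, beta, n_samples=100_000, rng=rng)
```

Because each z-score sums the effects of all causal SNPs in LD with the tagged SNP, a SNP can show a large z-score without being causal itself, which is why the distribution of z-scores depends on LD structure as well as on the distribution of causal effects.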