Benner Christian, Mahajan Anubha, Pirinen Matti
Genentech, South San Francisco, California, United States of America.
Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland.
PLoS Genet. 2025 Jan 9;21(1):e1011480. doi: 10.1371/journal.pgen.1011480. eCollection 2025 Jan.
Recent statistical approaches have shown that the set of all available genetic variants explains considerably more phenotypic variance of complex traits and diseases than the individual variants that are robustly associated with these phenotypes. However, rapidly increasing sample sizes constantly improve detection and prioritization of individual variants driving the associations between genomic regions and phenotypes. Therefore, it is useful to routinely estimate how much phenotypic variance the detected variants explain for each region by taking into account the correlation structure of variants and the uncertainty in their causal status. Here we extend the FINEMAP software to estimate the effect sizes and regional heritability under the probabilistic model that assumes a handful of causal variants per region. Using the UK Biobank (UKB) data to simulate genomic regions, we demonstrate that FINEMAP provides higher precision and enables more detailed decomposition of regional heritability into individual variants than the variance component model implemented in BOLT or the fixed-effect model implemented in HESS, particularly when there are only a few causal variants in the fine-mapped region. Using data from 2,940 plasma proteins from the UKB study, we observed that on average FINEMAP identified 2.5 causal variants at an association signal and captured 36% more regional heritability than the variant with the lowest P-value. We estimate that in genomic regions with notable contribution to the total heritability, FINEMAP captures on average 13% and 40% more heritability than BOLT and HESS, respectively. Our analysis shows how FINEMAP, BOLT and HESS relate to each other in cases where inference of a variant-level picture of the regional genetic architecture is possible.
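The comparison in the abstract between regional heritability and the heritability captured by the lowest-P-value variant can be illustrated with a minimal sketch. It assumes standardized genotypes and phenotype, so that joint causal effects `beta` and an LD correlation matrix `R` give a regional variance explained of `beta' R beta`, while a single variant explains the square of its marginal effect `(R beta)_j`. The function names and toy values are hypothetical, and the sketch treats the effects as known; it is not FINEMAP's implementation, which additionally averages over uncertainty in the causal configuration.

```python
import numpy as np

def regional_h2(beta, R):
    """Variance explained jointly by standardized effects beta under LD matrix R."""
    beta = np.asarray(beta, dtype=float)
    R = np.asarray(R, dtype=float)
    return float(beta @ R @ beta)

def top_variant_h2(beta, R):
    """Variance explained by the single variant with the largest marginal effect."""
    marginal = np.asarray(R, dtype=float) @ np.asarray(beta, dtype=float)
    return float(np.max(marginal ** 2))

# Toy region (hypothetical values): three variants, two of them causal.
R = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
beta = np.array([0.10, 0.0, 0.08])

h2_joint = regional_h2(beta, R)
h2_top = top_variant_h2(beta, R)
print(f"joint: {h2_joint:.6f}, top variant: {h2_top:.6f}, ratio: {h2_joint / h2_top:.3f}")
```

Under these assumptions the joint estimate can never fall below the single-variant one, which is why a model allowing several causal variants per region can capture more regional heritability than the lead variant alone.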