Zhang Xiangyu, Jiang Wei, Zhao Hongyu
Department of Biostatistics, School of Public Health, Yale University, New Haven, Connecticut, United States of America.
medRxiv. 2023 Oct 6:2023.10.03.23294486. doi: 10.1101/2023.10.03.23294486.
Genome-wide association studies (GWASs) have achieved remarkable success in associating thousands of genetic variants with complex traits. However, linkage disequilibrium (LD) makes it challenging to identify the causal variants. To bridge this gap from association to causation, many fine-mapping methods have been proposed that assign well-calibrated probabilities of causality to candidate variants while accounting for the underlying LD pattern. In this manuscript, we introduce a statistical framework that incorporates expression quantitative trait locus (eQTL) information into fine mapping, built on the sum of single-effects (SuSiE) regression model. Our new method, SuSiE2, connects two SuSiE models, one for eQTL analysis and one for genetic fine mapping. It first computes the posterior inclusion probabilities (PIPs) from an eQTL-based SuSiE model that takes the expression level of the candidate gene as the phenotype. These PIPs are then used as prior inclusion probabilities for risk variants in a second SuSiE model for the trait of interest. By leveraging eQTL information to prioritize functional variants within the candidate region, SuSiE2 increases the power to detect causal SNPs while reducing false positives and the average size of credible sets. The advantages of SuSiE2 over SuSiE are demonstrated by simulations and an application to a single-cell epigenomic study for Alzheimer's disease. We also demonstrate that SuSiE2 can use eQTL information to compensate for the power loss caused by an inaccurate LD matrix.
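The two-stage idea described above (eQTL PIPs reused as prior inclusion probabilities for the trait model) can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it assumes a single causal variant per model (one single-effect regression rather than a full SuSiE fit with several effects), a fixed prior effect variance, a crude residual-variance estimate, and simulated genotypes. The function names `simulate_genotypes` and `single_effect_pips`, the effect sizes, and the small floor applied to the prior weights are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_genotypes(rng, n, p, rho=0.9):
    """Toy genotypes (0/1/2) with LD: neighbouring SNPs share a latent AR(1) signal."""
    z = np.empty((n, p))
    z[:, 0] = rng.standard_normal(n)
    for j in range(1, p):
        z[:, j] = rho * z[:, j - 1] + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    return (z > 0.5).astype(float) + (z > -0.5).astype(float)

def single_effect_pips(X, y, prior_weights=None, prior_var=0.2, resid_var=None):
    """PIPs under a one-causal-variant model (a simplification of SuSiE with one effect).

    prior_weights: prior inclusion probability per SNP (uniform if None).
    prior_var:     assumed prior variance of the causal effect (hypothetical default).
    resid_var:     residual variance; defaults to var(y), a crude upper bound.
    """
    X = X - X.mean(axis=0)
    y = y - y.mean()
    p = X.shape[1]
    if prior_weights is None:
        prior_weights = np.full(p, 1.0 / p)
    # Assumption: floor the weights so a zero PIP from stage 1 cannot
    # completely exclude a SNP in stage 2, then renormalise.
    prior = np.clip(np.asarray(prior_weights, dtype=float), 1e-12, None)
    prior = prior / prior.sum()
    if resid_var is None:
        resid_var = y.var()
    xtx = (X**2).sum(axis=0)
    bhat = X.T @ y / xtx              # marginal effect estimates
    s2 = resid_var / xtx              # their sampling variances
    z2 = bhat**2 / s2
    # Log of the approximate Bayes factor for each SNP against the null
    log_bf = 0.5 * np.log(s2 / (s2 + prior_var)) + 0.5 * z2 * prior_var / (s2 + prior_var)
    log_post = np.log(prior) + log_bf
    post = np.exp(log_post - log_post.max())   # subtract max for numerical stability
    return post / post.sum()

rng = np.random.default_rng(0)
n_snps, causal = 20, 7

# Stage 1: eQTL model, with gene expression as the phenotype (strong effect).
X_eqtl = simulate_genotypes(rng, 500, n_snps)
expression = 0.6 * X_eqtl[:, causal] + rng.standard_normal(500)
eqtl_pips = single_effect_pips(X_eqtl, expression)

# Stage 2: trait model, fitted with a uniform prior and with the eQTL PIPs as prior.
X_gwas = simulate_genotypes(rng, 500, n_snps)
trait = 0.15 * X_gwas[:, causal] + rng.standard_normal(500)
gwas_pips_uniform = single_effect_pips(X_gwas, trait)
gwas_pips_informed = single_effect_pips(X_gwas, trait, prior_weights=eqtl_pips)

print("top eQTL SNP:", int(np.argmax(eqtl_pips)))
print("trait PIP at that SNP, uniform prior: ", gwas_pips_uniform[np.argmax(eqtl_pips)])
print("trait PIP at that SNP, eQTL-PIP prior:", gwas_pips_informed[np.argmax(eqtl_pips)])
```

In this sketch, the SNP with the highest eQTL PIP can only gain posterior weight in the trait model relative to the uniform-prior fit, which is the mechanism by which an informative prior can shrink credible sets. Whether that gain helps or hurts depends on whether the eQTL signal and the trait signal share a causal variant, an assumption the simulation builds in.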