Chan Zuckerberg Initiative Foundation, Redwood City, CA 94065, United States.
Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, United States.
Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae590.
We introduce a novel framework, BEATRICE, to identify putative causal variants from GWAS summary statistics. Identifying causal variants is challenging because they are sparse and highly correlated with variants in nearby regions. To account for these challenges, we rely on a hierarchical Bayesian model that imposes a binary concrete prior on the set of causal variants. We derive a variational algorithm for this fine-mapping problem by minimizing the KL divergence between an approximate density and the posterior probability distribution of the causal configurations. Correspondingly, we use a deep neural network as an inference machine to estimate the parameters of our proposal distribution. Our stochastic optimization procedure allows us to sample from the space of causal configurations, which we use to compute the posterior inclusion probabilities and determine credible sets for each causal variant. We conduct a detailed simulation study to quantify the performance of our framework against two state-of-the-art baseline methods across different numbers of causal variants and noise paradigms, as defined by the relative genetic contributions of causal and noncausal variants.
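As an illustration of the sampling step described above, the following minimal Python sketch draws relaxed binary vectors from a binary concrete distribution, estimates posterior inclusion probabilities (PIPs) as the fraction of samples in which each variant is switched on, and builds a credible set by accumulating normalized PIPs. It is not the BEATRICE implementation: the function names, the fixed `logits` (which in the framework would be produced by the neural network), the 0.5 threshold, and the single-signal credible-set rule are all assumptions made for illustration.

```python
import math
import random


def sample_binary_concrete(logits, temperature, rng):
    """Draw one relaxed binary vector; each entry lies in [0, 1]."""
    sample = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so the logs stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        logistic_noise = math.log(u) - math.log1p(-u)
        z = (logit + logistic_noise) / temperature
        # Numerically stable sigmoid written with tanh.
        sample.append(0.5 * (1.0 + math.tanh(0.5 * z)))
    return sample


def estimate_pips(logits, temperature, n_samples, rng, threshold=0.5):
    """Estimate PIPs as the fraction of samples in which a variant is 'on'."""
    counts = [0] * len(logits)
    for _ in range(n_samples):
        draw = sample_binary_concrete(logits, temperature, rng)
        for j, value in enumerate(draw):
            if value > threshold:
                counts[j] += 1
    return [c / n_samples for c in counts]


def credible_set(pips, coverage=0.95):
    """Add variants by decreasing PIP until normalized mass reaches coverage.

    Simplified single-signal rule (an assumption of this sketch).
    """
    total = sum(pips)
    if total == 0:
        return []
    order = sorted(range(len(pips)), key=lambda j: pips[j], reverse=True)
    chosen, mass = [], 0.0
    for j in order:
        chosen.append(j)
        mass += pips[j] / total
        if mass >= coverage:
            break
    return chosen


# Hypothetical logits for five variants; the third one is strongly favoured.
rng = random.Random(0)
logits = [-8.0, -8.0, 8.0, -8.0, -8.0]
pips = estimate_pips(logits, temperature=0.5, n_samples=2000, rng=rng)
print(pips, credible_set(pips))
```

Lower temperatures push the relaxed samples toward exact 0/1 values, while higher temperatures keep them smooth, which is what makes gradient-based optimization of the proposal parameters possible.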
We demonstrate that BEATRICE achieves uniformly better coverage with comparable power and set sizes, and that the performance gain increases with the number of causal variants. We also show the efficacy of BEATRICE in finding causal variants from a GWAS of Alzheimer's disease. In comparison to the baselines, only BEATRICE successfully finds the APOE ϵ2 allele, a variant commonly associated with Alzheimer's disease.
BEATRICE is available for download at https://github.com/sayangsep/Beatrice-Finemapping.