Institute of Genetic Medicine, Newcastle University, Newcastle, United Kingdom.
MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom.
PLoS Genet. 2020 Mar 2;16(3):e1008198. doi: 10.1371/journal.pgen.1008198. eCollection 2020 Mar.
Mendelian randomization (MR), implemented through instrumental variables analysis, is an increasingly popular causal inference tool in genetic epidemiology. However, it has limitations when evaluating simultaneous causal relationships in complex data sets that include, for example, multiple genetic predictors and multiple potential risk factors associated with the same genetic variant. Here we use real and simulated data to investigate Bayesian network analysis (BN) with the incorporation of directed arcs, representing genetic anchors, as an alternative approach. A Bayesian network describes the conditional dependencies and independencies of variables using a graphical model (a directed acyclic graph) with an accompanying joint probability distribution. In real data, we found that BN could be used to infer simultaneous causal relationships that confirmed the individual causal relationships suggested by bi-directional MR, while allowing for the existence of potential horizontal pleiotropy (which would violate MR assumptions). In simulated data, BN with two directional anchors (mimicking genetic instruments) had greater power for a fixed type 1 error rate than bi-directional MR, while BN with a single directional anchor performed better than or as well as bi-directional MR. Both BN and MR could be adversely affected by violations of their underlying assumptions (such as genetic confounding due to unmeasured horizontal pleiotropy). BN with no directional anchor generated inference that was no better than chance, emphasizing the importance of directional anchors in BN (as in MR). Under highly pleiotropic simulated scenarios, BN outperformed MR (and its recent extensions) as well as two recently proposed alternative approaches: a multi-SNP mediation intersection-union test (SMUT) and a latent causal variable (LCV) test.
We conclude that BN incorporating genetic anchors is a useful complementary method to conventional MR for exploring causal relationships in complex data sets such as those generated from modern "omics" technologies.
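As a rough illustration of the instrumental-variable idea behind MR referred to above, the following sketch simulates a single genetic variant, an exposure, and an outcome that share an unmeasured confounder, then compares a naive regression slope with the Wald ratio estimate. It is not the analysis performed in the paper: the effect sizes, allele frequency, sample size, and the function names `simulate`, `slope`, and `wald_ratio` are all assumptions chosen for illustration.

```python
import numpy as np


def simulate(n=20000, beta_xy=0.5, seed=0):
    """Simulate genotype g, exposure x and outcome y (all values are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2, allele frequency 0.3
    u = rng.normal(size=n)                          # unmeasured confounder of x and y
    x = 0.5 * g + u + rng.normal(size=n)            # exposure depends on genotype and confounder
    y = beta_xy * x + u + rng.normal(size=n)        # outcome depends on exposure and confounder only
    return g, x, y


def slope(a, b):
    """Ordinary least squares slope from regressing b on a."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


def wald_ratio(g, x, y):
    """Wald ratio: (effect of g on y) divided by (effect of g on x)."""
    return slope(g, y) / slope(g, x)


g, x, y = simulate()
naive = slope(x, y)       # biased upward here because u affects both x and y
mr = wald_ratio(g, x, y)  # should lie close to the simulated causal effect of 0.5
print(f"naive OLS slope: {naive:.3f}")
print(f"Wald ratio (MR) estimate: {mr:.3f}")
```

In this simulated setting the variant affects the outcome only through the exposure, so the instrument is valid by construction. Adding a direct term for `g` in the line that generates `y` would mimic horizontal pleiotropy and bias the Wald ratio, which is the kind of assumption violation the abstract discusses.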