Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA.
School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA.
Genet Epidemiol. 2023 Dec;47(8):585-599. doi: 10.1002/gepi.22535. Epub 2023 Aug 13.
We propose structural equation models (SEMs) as a general framework for inferring causal networks among metabolites and other complex traits. Traditionally, SEMs have been used only for individual-level data, under the assumption that all instrumental variables (IVs) are valid. To overcome these limitations, we propose both one- and two-sample approaches for causal network inference based on SEMs that can: (1) perform causal analysis and discover causal relationships among multiple traits; (2) account for the possible presence of some invalid IVs; (3) allow for data analysis using only genome-wide association studies (GWAS) summary statistics when individual-level data are not available; and (4) consider the possibility of bidirectional relationships between traits. Our method employs a simple stepwise selection to identify invalid IVs, thus avoiding false positives while possibly increasing true discoveries based on two-stage least squares (2SLS). We use both real GWAS data and simulated data to demonstrate the superior performance of our method over the standard 2SLS/SEMs. In the real data analysis, we apply the proposed approach to a human blood metabolite GWAS summary data set to uncover putative causal relationships among the metabolites; we also identify some metabolites that are putatively causal for Alzheimer's disease (AD), which, along with the inferred causal metabolite network, suggest some possible pathways of metabolites involved in AD.
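For readers unfamiliar with the 2SLS baseline mentioned above, the following is a minimal textbook sketch for a single exposure and a single outcome with individual-level data. It is not the authors' method: it assumes every IV is valid and does not include the stepwise selection of invalid IVs, the summary-statistics variant, or the network-level SEM. The function names, simulated genotypes, and effect sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def two_stage_least_squares(Z, x, y):
    """Textbook 2SLS estimate of the effect of exposure x on outcome y.

    Z: (n, p) instrument matrix, x: (n,) exposure, y: (n,) outcome.
    Assumes all instruments are valid (an assumption of this sketch).
    """
    n = len(x)
    Z1 = np.column_stack([np.ones(n), Z])  # instruments plus intercept
    # Stage 1: regress the exposure on the instruments, keep fitted values.
    gamma, *_ = np.linalg.lstsq(Z1, x, rcond=None)
    x_hat = Z1 @ gamma
    # Stage 2: regress the outcome on the fitted (instrumented) exposure.
    X1 = np.column_stack([np.ones(n), x_hat])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta[1]  # slope on the instrumented exposure


def simulate(n=20000, p=10, effect=0.5, seed=0):
    """Hypothetical data: p SNP-like instruments, one confounded exposure."""
    rng = np.random.default_rng(seed)
    Z = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # genotypes coded 0/1/2
    u = rng.normal(size=n)  # unmeasured confounder of x and y
    x = Z @ np.full(p, 0.3) + u + rng.normal(size=n)
    y = effect * x + u + rng.normal(size=n)
    return Z, x, y


Z, x, y = simulate()
beta_2sls = two_stage_least_squares(Z, x, y)

# Naive OLS of y on x for comparison; it is biased by the confounder u.
X_ols = np.column_stack([np.ones(len(x)), x])
beta_ols = np.linalg.lstsq(X_ols, y, rcond=None)[0][1]

print(f"2SLS estimate: {beta_2sls:.3f}")
print(f"OLS estimate:  {beta_ols:.3f}")
```

In this simulation the instruments affect the outcome only through the exposure, so the 2SLS slope should land near the simulated effect of 0.5 while the OLS slope is pulled upward by confounding. An invalid IV would correspond to adding a direct effect of some column of `Z` on `y`, which is the situation the stepwise selection described in the abstract is meant to handle.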