Cheng Qing, Yang Yi, Shi Xingjie, Yeung Kar-Fu, Yang Can, Peng Heng, Liu Jin
Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore 169857, Singapore.
Department of Statistics, Nanjing University of Finance and Economics, Nanjing, 210023, China.
NAR Genom Bioinform. 2020 May 4;2(2):lqaa028. doi: 10.1093/nargab/lqaa028. eCollection 2020 Jun.
The proliferation of genome-wide association studies (GWAS) has prompted the use of two-sample Mendelian randomization (MR), with genetic variants as instrumental variables (IVs), to infer causal relationships between health risk factors and disease outcomes. However, the unique features of GWAS demand that MR methods account for both linkage disequilibrium (LD) and the horizontal pleiotropy that is ubiquitous among complex traits, that is, the phenomenon in which a variant affects the outcome through pathways other than the exposure. Statistical methods that fail to consider LD and horizontal pleiotropy can yield biased estimates and false-positive causal relationships. To overcome these limitations, we propose MR-LDP, a probabilistic model for MR analysis that identifies causal effects between risk factors and disease outcomes from GWAS summary statistics while accounting for LD and horizontal pleiotropy among genetic variants, and we develop a computationally efficient algorithm for causal inference. We conducted comprehensive simulation studies to demonstrate the advantages of MR-LDP over existing methods. Moreover, we used two real exposure-outcome pairs to validate the results from MR-LDP against alternative methods, showing that our method is more efficient in using all instrumental variants in LD. By further applying MR-LDP to lipid traits and body mass index (BMI) as risk factors for complex diseases, we identified multiple significant causal relationships, including a protective effect of high-density lipoprotein cholesterol on peripheral vascular disease and a positive causal effect of BMI on hemorrhoids.
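To make the bias from horizontal pleiotropy concrete, the following minimal sketch simulates summary statistics and applies the standard inverse-variance weighted (IVW) estimator. This is not MR-LDP and not the authors' algorithm; it is a common baseline, shown here only to illustrate the problem the abstract describes. It assumes independent variants (no LD), and every number in it (effect sizes, standard errors, number of variants, the pleiotropic effect `alpha`) is hypothetical.

```python
import random


def ivw_estimate(gamma_hat, Gamma_hat, se_outcome):
    """Standard IVW causal estimate from per-variant summary statistics.

    gamma_hat:  estimated variant-exposure effects
    Gamma_hat:  estimated variant-outcome effects
    se_outcome: standard errors of the variant-outcome effects
    """
    num = sum(g * G / s ** 2 for g, G, s in zip(gamma_hat, Gamma_hat, se_outcome))
    den = sum(g * g / s ** 2 for g, s in zip(gamma_hat, se_outcome))
    return num / den


def simulate(beta, alpha, n_variants=200, se=0.01, seed=0):
    """Simulate summary statistics for independent variants (hypothetical values).

    beta:  true causal effect of the exposure on the outcome
    alpha: direct (horizontally pleiotropic) effect of each variant on the
           outcome, bypassing the exposure; alpha = 0 means valid instruments
    """
    rng = random.Random(seed)
    gamma = [rng.uniform(0.05, 0.15) for _ in range(n_variants)]
    gamma_hat = [g + rng.gauss(0, se) for g in gamma]
    Gamma_hat = [beta * g + alpha + rng.gauss(0, se) for g in gamma]
    return gamma_hat, Gamma_hat, [se] * n_variants


TRUE_BETA = 0.3
clean = ivw_estimate(*simulate(TRUE_BETA, alpha=0.0))
biased = ivw_estimate(*simulate(TRUE_BETA, alpha=0.02))

print(f"true effect:              {TRUE_BETA:.3f}")
print(f"IVW, no pleiotropy:       {clean:.3f}")
print(f"IVW, directional pleiot.: {biased:.3f}")
```

With valid instruments the IVW estimate stays close to the true effect, while a small direct effect of each variant on the outcome pushes it away from the truth. Handling LD would additionally require modeling the correlation among variants, which this sketch deliberately omits.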