Inferring causal direction between two traits using R² with application to transcriptome-wide association studies.

Author information

Division of Biostatistics and Health Data Science, School of Public Health, University of Minnesota, Minneapolis, MN, USA.

Department of Biostatistics, City University of Hong Kong, Kowloon, Hong Kong.

Publication information

Am J Hum Genet. 2024 Aug 8;111(8):1782-1795. doi: 10.1016/j.ajhg.2024.06.013. Epub 2024 Jul 24.

Abstract

In Mendelian randomization, two methods based on single SNP-trait correlations have been developed to infer the causal direction between an exposure (e.g., a gene) and an outcome (e.g., a trait): MR Steiger's method and its recent extension, Causal Direction-Ratio (CD-Ratio). Here we propose an approach based on R², the coefficient of determination, that combines information from multiple (possibly correlated) SNPs to simultaneously infer the presence and direction of a causal relationship between an exposure and an outcome. Our proposed method generalizes Steiger's method from using a single SNP to using multiple SNPs as instrumental variables (IVs). It is especially useful in transcriptome-wide association studies (TWASs) and similar applications, where sample sizes for gene expression (or another molecular trait) data are typically small, and it provides a more flexible and powerful approach to inferring causal directions. It can be applied to GWAS summary data with a reference panel. We also discuss the influence of invalid IVs and introduce a new approach, called R2S, to select and remove invalid IVs (if any) to enhance robustness. We compared the performance of the proposed method with that of existing methods in simulations to demonstrate its advantages. We applied the methods to identify causal genes for high/low-density lipoprotein cholesterol (HDL/LDL) using individual-level GTEx gene expression data and UK Biobank GWAS data. The proposed method was able to confirm some well-known causal genes while identifying some novel ones. Additionally, we illustrated an application of the proposed method to GWAS summary data to infer causal relationships between HDL/LDL and stroke/coronary artery disease (CAD).
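The intuition behind comparing coefficients of determination can be sketched on simulated individual-level data. The example below is only an illustration of the general idea, not the authors' procedure: it compares point estimates of R² with no formal test, uses independent SNPs, and omits the summary-data version and the R2S selection step. The function names, effect sizes, and sample size are all assumptions chosen for the illustration. When the SNPs affect the outcome only through the exposure, they should explain a larger share of variance in the exposure than in the outcome.

```python
import numpy as np

def r_squared(G, t):
    """In-sample R^2 from an OLS regression of trait t on SNP matrix G (with intercept)."""
    A = np.column_stack([np.ones(len(t)), G])
    coef, *_ = np.linalg.lstsq(A, t, rcond=None)
    resid = t - A @ coef
    return 1.0 - resid.var() / t.var()

def simulate(n=5000, p=10, theta=0.5, seed=0):
    """Hypothetical data: p independent SNPs -> exposure X -> outcome Y."""
    rng = np.random.default_rng(seed)
    G = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # genotypes coded 0/1/2
    beta = np.full(p, 0.2)                               # assumed SNP effects on X
    X = G @ beta + rng.normal(size=n)
    Y = theta * X + rng.normal(size=n)                   # X causally affects Y
    return G, X, Y

G, X, Y = simulate()
r2_x = r_squared(G, X)  # variance in the exposure explained by the SNPs
r2_y = r_squared(G, Y)  # variance in the outcome explained by the same SNPs

# Naive decision rule (an assumption for illustration): the trait better
# explained by the instruments is taken to be upstream.
direction = "X -> Y" if r2_x > r2_y else "Y -> X"
print(f"R2_X = {r2_x:.3f}, R2_Y = {r2_y:.3f}, inferred direction: {direction}")
```

In practice the two R² estimates carry sampling error, so a comparison of point estimates like this one would need to be replaced by a proper statistical test, and invalid IVs could distort the comparison.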
