Causal Inference in Transcriptome-Wide Association Studies with Invalid Instruments and GWAS Summary Data.

Authors

Xue Haoran, Shen Xiaotong, Pan Wei

Affiliations

School of Statistics, University of Minnesota, Minneapolis, Minnesota 55455.

Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55455.

Publication information

J Am Stat Assoc. 2023;118(543):1525-1537. doi: 10.1080/01621459.2023.2183127. Epub 2023 Mar 17.


DOI:10.1080/01621459.2023.2183127
PMID:37808547
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10557939/
Abstract

Transcriptome-wide association studies (TWAS) have recently emerged as a popular tool to discover (putative) causal genes by integrating an outcome GWAS dataset with another gene expression/transcriptome GWAS (called eQTL) dataset. In our motivating and target application, we'd like to identify causal genes for low-density lipoprotein cholesterol (LDL), which is crucial for developing new treatments for hyperlipidemia and cardiovascular diseases. The statistical principle underlying TWAS is (two-sample) two-stage least squares (2SLS) using multiple correlated SNPs as instrumental variables (IVs); it is closely related to typical (two-sample) Mendelian randomization (MR) using independent SNPs as IVs, which is expected to be impractical and lower-powered for TWAS (and some other) applications. However, often some of the SNPs used may not be valid IVs, e.g. due to the widespread pleiotropy of their direct effects on the outcome not mediated through the gene of interest, leading to false conclusions by TWAS (or MR). Building on recent advances in sparse regression, we propose a robust and efficient inferential method to account for both hidden confounding and some invalid IVs via two-stage constrained maximum likelihood (2ScML), an extension of 2SLS. We first develop the proposed method with individual-level data, then extend it both theoretically and computationally to GWAS summary data for the most popular two-sample TWAS design, to which almost all existing robust IV regression methods are however not applicable. We show that the proposed method achieves asymptotically valid statistical inference on causal effects, demonstrating its wider applicability and superior finite-sample performance over the standard 2SLS/TWAS (and MR). We apply the methods to identify putative causal genes for LDL by integrating large-scale lipid GWAS summary data with eQTL data.
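
The two-sample 2SLS principle described in the abstract, and the way invalid IVs distort it, can be illustrated with a small simulation. This is a minimal sketch and not the authors' 2ScML method. The sample sizes, effect sizes, allele frequency, and the helper names `simulate`, `fit`, and `run` are illustrative assumptions. The SNPs are drawn independently for simplicity, whereas TWAS typically uses correlated SNPs. The "oracle" estimator assumes the invalid SNPs are known in advance, which is exactly the information a method such as 2ScML has to infer from the data.

```python
import numpy as np

BETA = 0.5                       # true causal effect of expression on outcome (assumed)
GAMMA = np.full(10, 0.3)         # SNP effects on expression (assumed)
INVALID = [0, 1, 2]              # SNPs with direct (pleiotropic) effects on the outcome

def simulate(n, alpha, rng):
    """Draw centered genotypes Z, expression X, and outcome Y with a hidden confounder U."""
    Z = rng.binomial(2, 0.3, size=(n, GAMMA.size)).astype(float)
    Z -= Z.mean(axis=0)
    U = rng.normal(size=n)                           # unobserved confounder
    X = Z @ GAMMA + U + rng.normal(size=n)
    Y = BETA * X + Z @ alpha + 2.0 * U + rng.normal(size=n)
    return Z, X, Y

def fit(y, *columns):
    """Least-squares coefficients of y on the given columns (intercept fitted, then dropped)."""
    A = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.lstsq(A, y, rcond=None)[0][1:]

def run(alpha, seed=0):
    rng = np.random.default_rng(seed)
    Z1, X1, _ = simulate(5_000, alpha, rng)          # eQTL sample: genotypes + expression
    Z2, X2, Y2 = simulate(50_000, alpha, rng)        # GWAS sample: genotypes + outcome
    gamma_hat = fit(X1, Z1)                          # stage 1: expression on SNPs
    X2_hat = Z2 @ gamma_hat                          # imputed expression in the GWAS sample
    return {
        "ols": fit(Y2, X2)[0],                       # confounded by U
        "2sls": fit(Y2, X2_hat)[0],                  # stage 2: outcome on imputed expression
        "oracle": fit(Y2, X2_hat, Z2[:, INVALID])[0] # stage 2 adjusting for known invalid SNPs
    }

alpha_invalid = np.zeros(GAMMA.size)
alpha_invalid[INVALID] = 0.4                         # direct effects bypassing the gene

valid = run(np.zeros(GAMMA.size))                    # all IVs valid
invalid = run(alpha_invalid)                         # three invalid IVs

for name, res in [("all IVs valid", valid), ("some IVs invalid", invalid)]:
    print(name, {k: round(float(v), 3) for k, v in res.items()})
```

Under these assumptions, ordinary least squares is biased by the hidden confounder in both scenarios. Standard 2SLS lands close to the true effect only when every SNP is a valid IV, and adjusting for the invalid SNPs in the second stage restores an estimate close to the truth.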

