Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
Am J Hum Genet. 2023 Jan 5;110(1):44-57. doi: 10.1016/j.ajhg.2022.12.002.
Integrative genetic association methods have shown great promise in post-GWAS (genome-wide association study) analyses, in which one of the most challenging tasks is identifying putative causal genes and uncovering the molecular mechanisms of complex traits. Recent studies suggest that prevailing computational approaches, including transcriptome-wide association studies (TWASs) and colocalization analysis, are individually imperfect, but that their joint usage can yield robust and powerful inference results. This paper presents INTACT, a computational framework that integrates probabilistic evidence from these distinct types of analyses to implicate putative causal genes. The procedure is flexible and can work with a wide range of existing integrative analysis approaches. It has the unique ability to quantify the uncertainty of implicated genes, enabling rigorous control of false-positive discoveries. Taking advantage of this feature, we further propose an efficient algorithm, INTACT-GSE, for gene set enrichment analysis based on the integrated probabilistic evidence. We examine the proposed computational methods and illustrate their improved performance over existing approaches through simulation studies. We apply the proposed methods to analyze the multi-tissue eQTL data from the GTEx project and eight large-scale complex- and molecular-trait GWAS datasets from multiple consortia and the UK Biobank. Overall, we find that the proposed methods markedly improve on existing methods for implicating putative causal genes and are particularly advantageous in evaluating and identifying key gene sets and biological pathways underlying complex traits.
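To make concrete how gene-level probabilistic evidence can support control of false-positive discoveries, the following is a minimal sketch of a standard Bayesian false discovery rate (FDR) procedure. It is not the INTACT algorithm itself: it assumes that each gene already has a posterior probability of being a putative causal gene, and the function name `bayesian_fdr_select`, the gene labels, and the probability values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def bayesian_fdr_select(posteriors: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Select genes while keeping the expected FDR at or below alpha.

    `posteriors` maps a gene label to an (assumed) posterior probability
    that the gene is a putative causal gene.
    """
    # Rank genes from strongest to weakest evidence.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)

    selected: List[str] = []
    cumulative_lfdr = 0.0
    for rank, (gene, prob) in enumerate(ranked, start=1):
        # The local false discovery rate of a gene is 1 - posterior probability.
        cumulative_lfdr += 1.0 - prob
        # The expected FDR of the top-`rank` genes is the mean of their local
        # fdrs; it is non-decreasing in rank, so we can stop at the first excess.
        if cumulative_lfdr / rank > alpha:
            break
        selected.append(gene)
    return selected


# Hypothetical posterior probabilities, for illustration only.
example = {"GENE_A": 0.99, "GENE_B": 0.97, "GENE_C": 0.80, "GENE_D": 0.30}
print(bayesian_fdr_select(example, alpha=0.05))
```

With these illustrative values, the first two genes are retained at the 5% level, because adding the third would raise the average local fdr of the selected set to 0.08.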