Inferring Gene-Disease Association by an Integrative Analysis of eQTL Genome-Wide Association Study and Protein-Protein Interaction Data.

Authors

Wang Jun, Zheng Jiashun, Wang Zengmiao, Li Hao, Deng Minghua

Affiliations

Center for Quantitative Biology, Peking University, Beijing, China.

Department of Biochemistry and Biophysics, University of California, San Francisco, San Francisco, California, USA.

Publication information

Hum Hered. 2018;83(3):117-129. doi: 10.1159/000489761. Epub 2019 Jan 22.

Abstract

OBJECTIVES

Genome-wide association studies (GWASs) have revealed many candidate SNPs, but the mechanisms by which these SNPs influence diseases are largely unknown. In order to decipher the underlying mechanisms, several methods have been developed to predict disease-associated genes based on the integration of GWAS and eQTL data (e.g., Sherlock and COLOC). A number of studies have also incorporated information from gene networks into GWAS analysis to reprioritize candidate genes.

METHODS

Motivated by these two different approaches, we have developed a statistical framework to integrate information from GWAS, eQTL, and protein-protein interaction (PPI) data to predict disease-associated genes. Our approach is based on a hidden Markov random field (HMRF) model, and we called the resulting computational algorithm GeP-HMRF (a GWAS-eQTL-PPI-based HMRF).
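To illustrate the general idea of a Markov random field over an interaction network, here is a minimal sketch. It is not the GeP-HMRF algorithm: it assumes each gene already has a log Bayes factor summarizing its GWAS-eQTL evidence, uses an Ising-style coupling between interacting genes, and approximates the posterior with mean-field updates. The function name `hmrf_posteriors`, the parameters `h` and `beta`, and the toy gene names and values are all hypothetical.

```python
import math


def hmrf_posteriors(log_bf, edges, h=-2.0, beta=0.5, n_iter=50):
    """Mean-field posteriors for a binary Markov random field on a gene network.

    log_bf: dict gene -> log Bayes factor for association (hypothetical input)
    edges:  iterable of (gene_a, gene_b) interaction pairs
    h:      prior log-odds that a gene is disease-associated (assumed value)
    beta:   weight of the network term in the log-odds (assumed value)
    """
    neighbors = {g: set() for g in log_bf}
    for a, b in edges:
        if a in neighbors and b in neighbors and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Start from the network-free posterior of each gene.
    q = {g: 1.0 / (1.0 + math.exp(-(h + log_bf[g]))) for g in log_bf}

    for _ in range(n_iter):
        for g in q:
            # Expected neighbor "spin" (2q - 1) lies in [-1, 1]: likely-associated
            # neighbors raise the log-odds, likely-unassociated ones lower it.
            field = beta * sum(2.0 * q[n] - 1.0 for n in neighbors[g])
            q[g] = 1.0 / (1.0 + math.exp(-(h + log_bf[g] + field)))
    return q


# Toy example: G3 and G4 have the same weak evidence, but only G3
# interacts with the two strongly supported genes G1 and G2.
log_bf = {"G1": 4.0, "G2": 3.5, "G3": 0.5, "G4": 0.5}
edges = [("G1", "G2"), ("G1", "G3"), ("G2", "G3")]
posterior = hmrf_posteriors(log_bf, edges)
```

In this toy network, G3 ends up with a higher posterior than G4 purely because of its neighbors, which is the kind of reprioritization that network information is meant to provide.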

RESULTS

We compared the performance of GeP-HMRF with the Sherlock, COLOC, and NetWAS methods on 9 GWAS datasets, using the disease-related genes in the MalaCards database as the standard, and found that GeP-HMRF significantly improves prediction accuracy. We also applied GeP-HMRF to an age-related macular degeneration (AMD) dataset. Among the top 50 genes predicted by GeP-HMRF, 7 are reported by the MalaCards database to be AMD-related, with an enrichment p value of 3.61 × 10^-119. Among the top 20 genes predicted by GeP-HMRF, CFHR1, CFHR3, HTRA1, and CFH are AMD-related in the MalaCards database, and another 9 genes are supported by the literature.
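The abstract does not say which test produced the enrichment p value. One common choice for this kind of overlap is a one-sided hypergeometric test, sketched below under that assumption. The function name `hypergeom_enrichment_p` is hypothetical, and the background size (20000 genes) and number of known disease genes (100) are made-up illustration values, so the result is not expected to match the reported figure.

```python
from fractions import Fraction
from math import comb


def hypergeom_enrichment_p(total, known, top, overlap):
    """P(X >= overlap) when `top` genes are drawn without replacement from
    `total` genes, of which `known` are annotated as disease-related."""
    upper = min(known, top)
    tail = sum(
        comb(known, i) * comb(total - known, top - i)
        for i in range(overlap, upper + 1)
    )
    # Exact rational arithmetic avoids overflow in the huge binomial terms.
    return float(Fraction(tail, comb(total, top)))


# Hypothetical inputs: 20000 background genes, 100 known disease genes,
# a top-50 prediction list, and 7 known genes in that list.
p_value = hypergeom_enrichment_p(total=20000, known=100, top=50, overlap=7)
```

The p value depends strongly on the assumed background and on how many database genes count as known, so both should be reported alongside any enrichment result.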

CONCLUSIONS

We built a unified statistical model to predict disease-related genes by integrating GWAS, eQTL, and PPI data. Our approach outperforms Sherlock, COLOC, and NetWAS both in simulation studies and on 9 GWAS datasets. It can be generalized to incorporate other molecular trait data beyond eQTL and other interaction data beyond PPI.
