Platig John, Castaldi Peter J, DeMeo Dawn, Quackenbush John
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America.
Department of Biostatistics, Harvard Chan School of Public Health, Boston, Massachusetts, United States of America.
PLoS Comput Biol. 2016 Sep 12;12(9):e1005033. doi: 10.1371/journal.pcbi.1005033. eCollection 2016 Sep.
Genome-wide association studies (GWAS) and expression quantitative trait locus (eQTL) analyses have identified genetic associations with a wide range of human phenotypes. However, many of these variants have weak effects, and understanding their combined effect remains a challenge. One hypothesis is that multiple SNPs interact in complex networks to influence functional processes that ultimately lead to complex phenotypes, including disease states. Here we present CONDOR, a method that represents both cis- and trans-acting SNPs and the genes with which they are associated as a bipartite graph, and then uses the modular structure of that graph to place SNPs into a functional context. In applying CONDOR to eQTLs in chronic obstructive pulmonary disease (COPD), we found that the global network "hub" SNPs were devoid of GWAS disease associations. However, the network was organized into 52 communities of SNPs and genes, many of which were enriched for genes in specific functional classes. We identified local hubs within each community ("core SNPs"), and these were enriched for GWAS SNPs for COPD and many other diseases. These results are consistent with our intuition: rather than single SNPs influencing single genes, groups of SNPs are associated with the expression of families of functionally related genes, and disease SNPs are associated with the perturbation of those functions. These methods are not limited in their application to COPD and can be used in the analysis of a wide variety of disease processes and other phenotypic traits.
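The bipartite representation described above can be illustrated with a small sketch. It is not the CONDOR implementation. It assumes a toy list of SNP-gene eQTL pairs and a community assignment supplied by hand rather than found by optimization, and it scores that assignment with Barber's bipartite modularity, Q = (1/m) Σ_ij (A_ij − k_i d_j / m) δ(C_i, C_j). Treating each SNP's share of Q as its "core" score is an assumption made here for illustration. The abstract says only that core SNPs are local hubs within a community. All SNP and gene names, and the function `bipartite_modularity`, are hypothetical.

```python
from collections import defaultdict


def bipartite_modularity(edges, snp_comm, gene_comm):
    """Return (Q, per-SNP contributions) for Barber's bipartite modularity.

    edges: list of unique (snp, gene) pairs, one per eQTL association.
    snp_comm / gene_comm: dicts mapping each node to a community label.
    """
    m = len(edges)  # total number of edges (assumes no duplicate pairs)
    snp_deg = defaultdict(int)
    gene_deg = defaultdict(int)
    for snp, gene in edges:
        snp_deg[snp] += 1
        gene_deg[gene] += 1

    edge_set = set(edges)
    contrib = {}
    for snp, k in snp_deg.items():
        total = 0.0
        for gene, d in gene_deg.items():
            # Only SNP-gene pairs in the same community contribute to Q.
            if snp_comm[snp] == gene_comm[gene]:
                observed = 1.0 if (snp, gene) in edge_set else 0.0
                expected = k * d / m  # expected edges under the null model
                total += observed - expected
        contrib[snp] = total / m
    return sum(contrib.values()), contrib


# Toy eQTL network: two dense SNP-gene blocks joined by one bridging edge.
edges = [
    ("rs1", "G1"), ("rs1", "G2"),
    ("rs2", "G1"), ("rs2", "G2"),
    ("rs3", "G3"), ("rs3", "G4"),
    ("rs4", "G3"), ("rs4", "G4"),
    ("rs2", "G3"),  # bridge between the two blocks
]

# Hand-assigned communities (an assumption; a real analysis would optimize Q).
snp_comm = {"rs1": 0, "rs2": 0, "rs3": 1, "rs4": 1}
gene_comm = {"G1": 0, "G2": 0, "G3": 1, "G4": 1}

q, core = bipartite_modularity(edges, snp_comm, gene_comm)

# Global hubs are ranked by raw degree; "core" SNPs by modularity contribution.
degree = defaultdict(int)
for snp, _ in edges:
    degree[snp] += 1
global_hub = max(degree, key=degree.get)
top_core = max(core, key=core.get)

print(f"Q = {q:.3f}")
print("highest-degree SNP:", global_hub)
print("highest core-score SNP in the toy network:", top_core)
```

In this toy network the highest-degree SNP is not the SNP with the largest modularity contribution, because its bridging edge falls outside its own community. This is only a small-scale analogue of the distinction the abstract draws between global hubs and core SNPs.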