Department of Statistics, The Pennsylvania State University, University Park, PA, 16802, USA.
Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, 16802, USA.
Nat Commun. 2021 May 14;12(1):2851. doi: 10.1038/s41467-021-22588-0.
Genome-wide association studies (GWAS) have cataloged many significant associations between genetic variants and complex traits. However, most of these findings have unclear biological significance, because they often have small effects and occur in non-coding regions. Integration of GWAS with gene regulatory networks addresses both issues by aggregating weak genetic signals within regulatory programs. Here we develop a Bayesian framework that integrates GWAS summary statistics with regulatory networks to infer genetic enrichments and associations simultaneously. Our method improves upon existing approaches by explicitly modeling network topology to assess enrichments, and by automatically leveraging enrichments to identify associations. Applying this method to 18 human traits and 38 regulatory networks shows that genetic signals of complex traits are often enriched in interconnections specific to trait-relevant cell types or tissues. Prioritizing variants within enriched networks identifies known and previously undescribed trait-associated genes, revealing biological and therapeutic insights.
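As a rough illustration of how an enrichment can feed back into association calls, the toy sketch below raises the prior log-odds of association for a variant that lies in a network and then updates that prior with a Wakefield-style approximate Bayes factor computed from a summary-statistic z-score. It is not the model described in this abstract: it ignores network topology and linkage disequilibrium, and the function names and the values of `theta0`, `theta`, and `prior_sd` are hypothetical choices made only for the example.

```python
import math


def approx_bayes_factor(z, se, prior_sd):
    """Wakefield-style approximate Bayes factor for H1 (effect) vs H0 (no effect).

    z        : GWAS z-score for one variant
    se       : standard error of the estimated effect
    prior_sd : assumed prior standard deviation of the true effect under H1
    """
    v, w = se ** 2, prior_sd ** 2
    shrink = w / (v + w)
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z * z * shrink)


def posterior_prob(z, se, in_network, theta0=-4.0, theta=2.0, prior_sd=0.1):
    """Posterior probability of association for a single variant.

    theta0 is the baseline prior log-odds of association and theta is the
    enrichment, i.e. the extra log-odds for variants inside the network.
    Both values are illustrative assumptions, not estimates from any data.
    """
    prior_log_odds = theta0 + (theta if in_network else 0.0)
    post_odds = math.exp(prior_log_odds) * approx_bayes_factor(z, se, prior_sd)
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    # Same modest z-score, with and without network membership.
    for member in (False, True):
        p = posterior_prob(z=3.0, se=0.05, in_network=member)
        print(f"in_network={member}: posterior probability = {p:.3f}")
```

With a positive `theta`, the same modest z-score yields a higher posterior probability inside the network than outside it, which is the sense in which an enrichment can help surface weak signals. In a full analysis the enrichment would be inferred jointly with the associations rather than fixed in advance.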