Center for Biomedical Informatics, The University of Chicago, Chicago, Illinois, USA.
J Am Med Inform Assoc. 2013 Jul-Aug;20(4):619-29. doi: 10.1136/amiajnl-2012-001519. Epub 2013 Jan 25.
While genome-wide association studies (GWAS) of complex traits have revealed thousands of reproducible genetic associations to date, these loci collectively explain very little of the heritability of their respective diseases and, in general, have contributed little to our understanding of the underlying disease biology. Physical protein interactions have been used to increase our understanding of human Mendelian disease loci but have yet to be fully exploited for complex traits.
We hypothesized that protein interaction modeling of GWAS findings could highlight important disease-associated loci and unveil the role of their network topology in the genetic architecture of diseases with complex inheritance.
Network modeling of proteins associated with the intragenic single nucleotide polymorphisms of the National Human Genome Research Institute catalog of complex trait GWAS revealed that complex trait-associated loci are more likely to be hub and bottleneck genes in available, albeit incomplete, networks (OR = 1.59, Fisher's exact test p < 2.24 × 10⁻¹²). Network modeling also prioritized novel type 2 diabetes (T2D) genetic variations from the Finland-USA Investigation of Non-Insulin-Dependent Diabetes Mellitus Genetics and the Wellcome Trust GWAS data, and demonstrated the enrichment of hubs and bottlenecks in prioritized T2D GWAS genes. The potential biological relevance of the T2D hub and bottleneck genes was revealed by their increased number of first-degree protein interactions with known T2D genes according to several independent sources (p < 0.01, probability of being first interactors of known T2D genes).
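The kind of analysis summarized above can be illustrated with a minimal sketch. The abstract does not state how hubs and bottlenecks were defined or whether the Fisher's exact test was one- or two-sided, so everything here is an assumption: hubs are taken as the top fraction of nodes by degree, bottlenecks as the top fraction by shortest-path betweenness (computed with Brandes' algorithm), and enrichment is tested one-sidedly. The toy network, the `top_fraction` cutoff, the `trait_genes` set, and all function names are hypothetical and are not taken from the study.

```python
from collections import deque
from math import comb


def degree(adj):
    """Number of interaction partners of each node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def betweenness(adj):
    """Unnormalized shortest-path betweenness for an unweighted,
    undirected graph (Brandes' algorithm)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def top_nodes(scores, top_fraction):
    """Nodes whose score reaches the cutoff of the top fraction (ties kept)."""
    k = max(1, int(len(scores) * top_fraction))
    cutoff = sorted(scores.values(), reverse=True)[k - 1]
    return {n for n, s in scores.items() if s >= cutoff}


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].
    Returns (sample odds ratio, p-value for enrichment in cell a)."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, min(row1, col1) + 1))
    odds = (a * d) / (b * c) if b * c else float("inf")
    return odds, tail / comb(n, col1)


# Hypothetical toy network: two triangles joined through node D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

top_fraction = 0.2                      # assumed cutoff, not from the source
hubs = top_nodes(degree(adj), top_fraction)
bottlenecks = top_nodes(betweenness(adj), top_fraction)
central = hubs | bottlenecks

trait_genes = {"C", "D", "F"}           # hypothetical trait-associated genes
a = len(trait_genes & central)          # trait gene, central
b = len(trait_genes - central)          # trait gene, not central
c = len(central - trait_genes)          # other gene, central
d = len(set(adj) - trait_genes - central)
odds_ratio, p_value = fisher_exact_greater(a, b, c, d)
print(sorted(hubs), sorted(bottlenecks), odds_ratio, p_value)
```

In this toy graph the bridging node has few partners but lies on every path between the two triangles, which is why a betweenness-based criterion can flag genes that a degree-based criterion misses. With a network this small the test has no power; the sketch only shows how the contingency table is assembled.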
Virtually all common diseases are complex human traits, and thus the topological centrality of complex trait genes in protein networks has implications for genetics, personal genomics, and therapy.