Zhang Qingrun, Long Quan, Ott Jurg
Department of Genetics and Genomic Sciences, Institute of Genomics and Multi-scale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America.
Institute of Psychology, Chinese Academy of Sciences, Chaoyang District, Beijing, PR China; Laboratory of Statistical Genetics, The Rockefeller University, New York, New York, United States of America.
PLoS Comput Biol. 2014 Jun 5;10(6):e1003627. doi: 10.1371/journal.pcbi.1003627. eCollection 2014 Jun.
Identifying gene-gene interactions is an active topic in genome-wide association studies. Two fundamental challenges are: (1) how to efficiently identify combinations of variants that may be associated with the trait from the astronomical number of all possible combinations; and (2) how to test for epistatic interaction when all potential combinations are available. We developed AprioriGWAS, which brings two innovations. (1) AprioriGWAS builds on Apriori, a successful method in the field of Frequent Itemset Mining (FIM) that uses a pattern growth strategy to reduce the search space effectively and accurately, and can therefore efficiently identify genotype patterns associated with the trait. (2) To test hypotheses of epistasis, we adopt a new conditional permutation procedure to obtain reliable statistical inference for Pearson's chi-square test on the [Formula: see text] contingency table generated by the associated variants. Applying AprioriGWAS to age-related macular degeneration (AMD) data, we found that: (1) angiopoietin 1 (ANGPT1) and four retinal genes interact with Complement Factor H (CFH); and (2) the GO term "glycosaminoglycan biosynthetic process" is enriched among AMD interacting genes. The epistatic interactions newly found by AprioriGWAS in the AMD data are likely true interactions, since the genes interacting with CFH are retinal genes, and the GO term enrichment also supports an important role for the interaction between glycosaminoglycans (GAGs) and CFH in the disease pathology of AMD. Applying AprioriGWAS to the bipolar disorder data from WTCCC, we found that variants without marginal effects show significant interactions, for example multiple-SNP genotype patterns within the genes GABRB2 and GRIA1 (AMPA subunit 1 receptor gene). AMPA receptors (AMPARs) are found in many parts of the brain and are the most commonly found receptor in the nervous system. GABRB2 mediates the fastest inhibitory synaptic transmission in the central nervous system. Multiple lines of evidence support the relevance of GRIA1 and GABRB2 to mental disorders.
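The pattern growth idea borrowed from Apriori can be illustrated with a minimal Python sketch. This is not the AprioriGWAS implementation: the abstract does not state how patterns are scored, so the sketch assumes plain support (the fraction of cases carrying a pattern) as a stand-in criterion. The function names, the `min_support` and `max_size` parameters, and the toy genotypes (coded 0/1/2 per SNP) are all hypothetical. The sketch shows only the level-wise search: a pattern of k+1 SNP genotypes is considered only if every one of its k-item sub-patterns already passed the threshold, which is what shrinks the search space.

```python
from itertools import combinations


def pattern_support(samples, pattern):
    """Fraction of samples carrying every (snp, genotype) item in the pattern."""
    if not samples:
        return 0.0
    hits = sum(all(s.get(snp) == g for snp, g in pattern) for s in samples)
    return hits / len(samples)


def apriori_patterns(samples, min_support, max_size):
    """Level-wise (Apriori-style) search for frequent genotype patterns.

    samples: list of dicts mapping a SNP id to a genotype code (assumed 0/1/2).
    Returns a dict {frozenset of (snp, genotype) items: support}.
    """
    # Level 1: every single (snp, genotype) item that meets the threshold.
    items = sorted({(snp, g) for s in samples for snp, g in s.items()})
    current = [frozenset([it]) for it in items
               if pattern_support(samples, frozenset([it])) >= min_support]
    frequent = {}
    size = 1
    while current:
        for p in current:
            frequent[p] = pattern_support(samples, p)
        if size >= max_size:
            break
        prev = set(current)
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            # Grow by exactly one item per level.
            if len(union) != size + 1:
                continue
            # A pattern may hold only one genotype per SNP.
            if len({snp for snp, _ in union}) != size + 1:
                continue
            # Apriori pruning: every sub-pattern of the previous size
            # must itself have been frequent.
            if all(frozenset(sub) in prev
                   for sub in combinations(union, size)):
                candidates.add(union)
        current = [c for c in candidates
                   if pattern_support(samples, c) >= min_support]
        size += 1
    return frequent


# Toy genotypes for four hypothetical cases (illustrative values only).
toy_cases = [
    {"rs1": 2, "rs2": 1, "rs3": 0},
    {"rs1": 2, "rs2": 1, "rs3": 1},
    {"rs1": 2, "rs2": 1, "rs3": 0},
    {"rs1": 0, "rs2": 0, "rs3": 2},
]
found = apriori_patterns(toy_cases, min_support=0.75, max_size=2)
for pattern, support in found.items():
    print(sorted(pattern), support)
```

In this toy run only the genotypes rs1=2 and rs2=1 pass the threshold at the first level, so the single two-SNP candidate built from them is the only pair that is ever evaluated. A real analysis would score patterns against both cases and controls and would then test the resulting contingency table, as the abstract describes.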