Liu Xuanshi, Xu Wenjian, Leng Fei, Hao Chanjuan, Kolora Sree Rohit Raj, Li Wei
Beijing Key Laboratory for Genetics of Birth Defects, Beijing Pediatric Research Institute, Beijing, China.
MOE Key Laboratory of Major Diseases in Children, Beijing, China.
Comput Struct Biotechnol J. 2020 Oct 21;18:2945-2952. doi: 10.1016/j.csbj.2020.10.014. eCollection 2020.
Genome-wide association studies (GWAS) have contributed significantly to understanding disease etiology by associating single nucleotide polymorphisms (SNPs) with complex diseases. However, most GWAS-SNPs lie in noncoding regions and may affect distal genes through long-range enhancer-promoter interactions. Thus, common practice in interpreting GWAS discoveries cannot fully reveal the molecular mechanisms underpinning complex diseases. Perturbations of topologically associating domains (TADs) are known to lead to long-range interactions that underlie disease etiology. To identify probable long-range interactions in noncoding regions via GWAS and TADs perturbed by deletions, we integrated datasets of GWAS-SNPs, enhancers, TADs, and deletions. After ranking and clustering, we prioritized 201,132 high-confidence pairs of GWAS-SNPs and target genes. In this study, we performed a systematic inference on noncoding regions via GWAS-SNPs and deletion-perturbed TADs to boost GWAS discovery power. The high-confidence pairs of GWAS-SNPs and target genes (SE-Gs) provide promising candidates for understanding the molecular mechanisms underlying complex diseases, with emphasis on the three-dimensional genome.
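The integration step described in the abstract (GWAS-SNPs, enhancers, TADs, and deletions) can be pictured as a set of interval overlaps. The sketch below is illustrative only and is not the authors' pipeline: it assumes, as a simplification, that a deletion overlapping two or more TADs fuses them into one domain, and it pairs each enhancer-resident SNP with genes that become reachable only through that fusion. The `Interval` type, all function names, and the toy coordinates are hypothetical, and the ranking and clustering steps mentioned in the abstract are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A named genomic interval; start is 0-based inclusive, end is exclusive."""
    chrom: str
    start: int
    end: int
    name: str = ""


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def contains(outer: Interval, inner: Interval) -> bool:
    """True if inner lies entirely within outer."""
    return (outer.chrom == inner.chrom
            and outer.start <= inner.start
            and inner.end <= outer.end)


def fused_domains(tads, deletions):
    """Assumption: a deletion touching >= 2 TADs removes a boundary and fuses them."""
    fused = []
    for d in deletions:
        hit = [t for t in tads if overlaps(t, d)]
        if len(hit) >= 2:
            fused.append(Interval(d.chrom,
                                  min(t.start for t in hit),
                                  max(t.end for t in hit),
                                  d.name))
    return fused


def candidate_pairs(snps, enhancers, genes, tads, deletions):
    """Pair enhancer-resident SNPs with genes newly reachable in a fused domain."""
    pairs = set()
    for dom in fused_domains(tads, deletions):
        for snp in snps:
            if not contains(dom, snp):
                continue
            # Keep only SNPs that fall inside an annotated enhancer.
            if not any(contains(e, snp) for e in enhancers):
                continue
            # The SNP's original TAD(s); genes there were already reachable.
            home = [t for t in tads if contains(t, snp)]
            for g in genes:
                if overlaps(dom, g) and not any(overlaps(t, g) for t in home):
                    pairs.add((snp.name, g.name))
    return sorted(pairs)


# Toy data (entirely made up) on a single chromosome.
tads = [Interval("chr1", 0, 1000, "TAD_A"),
        Interval("chr1", 1000, 2000, "TAD_B"),
        Interval("chr1", 2000, 3000, "TAD_C")]
deletions = [Interval("chr1", 950, 1050, "del1")]  # spans the A/B boundary
enhancers = [Interval("chr1", 400, 600, "enh1"),
             Interval("chr1", 2400, 2600, "enh2")]
snps = [Interval("chr1", 500, 501, "rs_a"),    # in enh1, TAD_A
        Interval("chr1", 700, 701, "rs_b"),    # not in any enhancer
        Interval("chr1", 2500, 2501, "rs_c")]  # in enh2, but TAD_C is unperturbed
genes = [Interval("chr1", 100, 300, "GENE1"),
         Interval("chr1", 1500, 1700, "GENE2"),
         Interval("chr1", 2700, 2900, "GENE3")]

result = candidate_pairs(snps, enhancers, genes, tads, deletions)
print(result)
```

In this toy example only `rs_a` is paired, and only with `GENE2`, the gene in the neighbouring TAD that the deletion merges with the SNP's own. A fuller treatment would also discard SNPs, enhancers, or genes that are themselves removed by the deletion, which this sketch does not do.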