Chen Jingqi, Tian Weidong
State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200436, P.R. China; Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai 200436, P.R. China.
Nucleic Acids Res. 2016 Oct 14;44(18):8641-8654. doi: 10.1093/nar/gkw519. Epub 2016 Jun 8.
Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGRs), making it difficult to understand their association with disease phenotypes. Recent analyses found that non-coding daSNPs are frequently located in or close to regulatory elements, which inspired us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the distal regulatory element (DRE) nearest to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. Of all IGR daSNP-disease phenotype associations investigated, 36.8% were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion increased to 60.5% when SNPs in linkage disequilibrium (LD) with the daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, IGR daSNPs may contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes.
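The first step described in the abstract, locating the DRE nearest to a given IGR daSNP, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how distance is measured, so the sketch assumes distance from the SNP position to the DRE midpoint. The function names `build_index` and `nearest_dre` and the example coordinates are hypothetical. The INTREPID target-gene prediction step is not reproduced here.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(dres):
    """Group DREs by chromosome and sort them by midpoint.

    `dres` is an iterable of (chrom, start, end, name) tuples.
    Using the midpoint as the DRE's position is an assumption of this sketch.
    """
    index = defaultdict(list)
    for chrom, start, end, name in dres:
        index[chrom].append(((start + end) // 2, name))
    for chrom in index:
        index[chrom].sort()
    return index


def nearest_dre(index, chrom, pos):
    """Return (name, distance) for the DRE midpoint closest to `pos`.

    Returns None when the chromosome has no DREs in the index.
    """
    entries = index.get(chrom)
    if not entries:
        return None
    mids = [mid for mid, _ in entries]
    i = bisect_left(mids, pos)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = entries[max(i - 1, 0): i + 1]
    mid, name = min(candidates, key=lambda e: abs(e[0] - pos))
    return name, abs(mid - pos)


# Hypothetical example data, for illustration only.
example_dres = [
    ("chr1", 100, 200, "DRE_A"),
    ("chr1", 1000, 1200, "DRE_B"),
    ("chr2", 50, 150, "DRE_C"),
]
example_index = build_index(example_dres)
```

Sorting midpoints once per chromosome lets each SNP be resolved with a binary search rather than a scan over all DREs, which matters when thousands of SNPs are queried.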