Interdisciplinary Program, Bioengineering Major, Graduate School, Seoul National University, Seoul 151-742, Republic of Korea.
Department of Bio and Brain Engineering, KAIST, Daejeon 34141, Republic of Korea.
Genomics. 2020 Mar;112(2):1208-1213. doi: 10.1016/j.ygeno.2019.07.006. Epub 2019 Jul 8.
Interpretation of noncoding disease variants, which comprise the vast majority of genome-wide association study (GWAS) hits, remains a major challenge because of haplotype structure and our limited understanding of the mechanisms and physiological contexts of noncoding elements. GWAS have identified loci underlying human diseases, but assigning the causal nucleotide changes remains contentious. Here we addressed these issues by combining high-density genotyping and epigenomic data in a random forest model to discover noncoding causal variants. Focusing on autoimmune diseases, we triaged putative causal variants for atopic dermatitis and inflammatory bowel diseases. Using a filtering pipeline, we found three single nucleotide polymorphisms of interest (rs1800630, rs1799964 and rs4796793) upstream of the TNF and STAT3 genes, two genes frequently shared among autoimmune diseases, and showed how those variants affect TNF and STAT3 expression levels. All data and source code related to this manuscript are available at https://github.com/jieunjung511/Autoimmune-research.