Lo Ken Sin, Vadlamudi Swarooparani, Fogarty Marie P, Mohlke Karen L, Lettre Guillaume
Montreal Heart Institute, Montreal, Quebec, Canada.
Department of Genetics, University of North Carolina, Chapel Hill, NC, USA.
Genomics. 2014 Aug;104(2):105-12. doi: 10.1016/j.ygeno.2014.04.006. Epub 2014 Jul 2.
Characterization of the epigenome promises to reveal the functional elements buried in the human genome sequence, thus helping to annotate non-coding DNA polymorphisms with regulatory functions. Here, we develop two novel strategies that combine epigenomic data with transcriptomic profiles in humans or mice to prioritize candidate SNPs associated with lipid levels in genome-wide association studies (GWAS). First, after confirming that lipid-associated loci that are also expression quantitative trait loci (eQTL) in human livers are enriched for ENCODE regulatory marks in the human hepatocellular HepG2 cell line, we prioritize candidate SNPs based on the number of these marks that overlap the variant position. This method recognized the known SORT1 rs12740374 regulatory SNP associated with LDL-cholesterol, and highlighted candidate functional SNPs at 15 additional lipid loci. In the second strategy, we combine ENCODE chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq) data with liver expression datasets from knockout mice lacking specific transcription factors. This approach identified SNPs that lie in the binding sites of specific transcription factors and near the target genes of those same factors. We show that FOXA2 transcription factor binding sites are enriched at lipid-associated loci, and we experimentally validate that the alleles of one such proxy SNP, located near the FOXA2 target gene BIRC5, differ in FOXA2-DNA binding and enhancer activity. These methods can be used to generate testable hypotheses for many non-coding SNPs associated with complex diseases or traits.
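The first strategy, ranking candidate SNPs by the number of regulatory marks that overlap the variant position, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the record types `Snp` and `Mark`, the helper names `count_overlapping_marks` and `rank_snps`, the half-open interval convention, and all coordinates and SNP identifiers in the toy data are assumptions made purely for illustration.

```python
from collections import namedtuple

# Hypothetical record types, introduced for illustration only.
Snp = namedtuple("Snp", ["rsid", "chrom", "pos"])
# Intervals are assumed to be half-open: [start, end).
Mark = namedtuple("Mark", ["name", "chrom", "start", "end"])


def count_overlapping_marks(snp, marks):
    """Count distinct mark names whose interval covers the SNP position."""
    return len({
        m.name
        for m in marks
        if m.chrom == snp.chrom and m.start <= snp.pos < m.end
    })


def rank_snps(snps, marks):
    """Sort SNPs by descending mark count; ties are broken by rsid."""
    scored = [(snp, count_overlapping_marks(snp, marks)) for snp in snps]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0].rsid))


# Toy data: identifiers and coordinates are invented, not real annotations.
snps = [
    Snp("rsA", "chr1", 150),
    Snp("rsB", "chr1", 500),
    Snp("rsC", "chr2", 150),
]
marks = [
    Mark("H3K27ac", "chr1", 100, 200),
    Mark("DNase", "chr1", 140, 160),
    Mark("H3K4me1", "chr1", 480, 520),
    Mark("H3K27ac", "chr2", 300, 400),
]

ranking = rank_snps(snps, marks)
for snp, score in ranking:
    print(snp.rsid, score)
```

In this toy example the SNP covered by two marks ranks first and the SNP covered by none ranks last. A real analysis would read mark intervals from annotation files and would likely use an interval index rather than the linear scan shown here.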