Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada.
PLoS One. 2012;7(1):e30158. doi: 10.1371/journal.pone.0030158. Epub 2012 Jan 17.
Transposable elements (TEs) are mobile DNA sequences found in the genomes of almost all species. By measuring the normalized coverage of TE sequences within genes, we identified sets of genes with conserved extremes of high or low TE density in the genomes of human, mouse and cow, and denoted them 'shared upper outliers' (SUOs) and 'shared lower outliers' (SLOs), respectively. By comparing these outlier genes to the genomic background, we show that a large proportion of SUOs are involved in metabolic pathways and tend to be mammal-specific, whereas many SLOs are related to developmental processes and have more ancient origins. Furthermore, the proportions of different types of TEs within orthologous human and mouse SUOs were highly similar, even though most detectable TEs in these two genomes were inserted after the species diverged. Interestingly, our computational analysis of polymerase II (Pol-II) occupancy at gene promoters in different mouse tissues showed that 60% of tissue-specific SUOs show strong Pol-II binding only in embryonic stem cells (ESCs), a proportion significantly higher than the genomic background (37%). In addition, our analysis of histone marks such as H3K4me3 and H3K27me3 in mouse ESCs also suggests a strong association between TE-rich genes and open chromatin at promoters. Finally, two independent whole-transcriptome datasets show a positive association between TE density and gene expression level in ESCs. While this study focuses on genes with extreme TE densities, the above results clearly show that TE accumulation/fixation in mammalian genes is not random: its probability is likely associated with several factors and gene properties and, most importantly, the TE insertion/fixation rate is associated with gene activity status in ESCs.