Polak Paz, Domany Eytan
Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot, 76100, Israel.
BMC Genomics. 2006 Jun 1;7:133. doi: 10.1186/1471-2164-7-133.
The human genome contains over one million Alu repeat elements, and their distribution is not uniform. Metabolism-related genes have been shown to be enriched in Alu elements, whereas in structural genes Alu elements are under-represented. Such observations led researchers to suggest that Alu elements are involved in gene regulation and were selected to be present in some genes and absent from others. This hypothesis is gaining strength from findings that indicate involvement of Alu elements in a variety of functions; for example, Alu sequences were found to contain several functional transcription factor (TF) binding sites (BSs). We searched for new putative BSs on Alu elements, using a database of Position Specific Score Matrices (PSSMs). We searched consensus Alu sequences as well as specific Alu elements that appear in the 5 kbp regions upstream of the transcription start site (TSS) of about 14,000 genes.
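As an illustration of what a PSSM search involves, the following is a minimal sketch, not the procedure used in this study. It assumes a log-odds matrix built from base counts with a uniform background and a pseudocount, and a fixed score threshold; the function names, the toy motif, the toy sequence, and the threshold are all hypothetical.

```python
import math

BASES = "ACGT"

def log_odds_pssm(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into log2-odds scores.

    Assumes a uniform background frequency and an additive pseudocount;
    both are illustrative choices, not values taken from the study.
    """
    pssm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pssm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm

def scan(sequence, pssm, threshold):
    """Return (start, score) for every window scoring at least threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Skip windows containing ambiguous bases such as N.
        if any(base not in BASES for base in window):
            continue
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical 4-bp motif "TGAC" observed in 10 aligned sites.
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
toy_pssm = log_odds_pssm(toy_counts)

# Hypothetical sequence; only the forward strand is scanned here.
hits = scan("GGCTGACCTCA", toy_pssm, threshold=4.0)
```

In this toy setup only a perfect match to the motif clears the threshold. A realistic scan would also cover the reverse strand and would need a principled way to choose the threshold for each matrix.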
We found that the regions upstream of the TSS are enriched with Alu elements, and that the Alu consensus sequences contain dozens of putative BSs for TFs. Hence several TFs have Alu-associated BSs upstream of the TSS of many genes. For several TFs, most of the putative BSs reside on Alu; a few of these BSs were found previously, and their association with Alu was also reported. In four cases the fact that the identified BSs reside on Alu went unnoticed, and we report this association for the first time. We also found dozens of new putative BSs. Interestingly, many of the corresponding TFs are associated with early markers of development, even though the upstream regions of development-related genes are Alu-poor compared with those of genes related to translation and protein biosynthesis, which are Alu-rich. Finally, we found a correlation between the mouse B1 and human Alu densities within the corresponding upstream regions of orthologous genes.
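The statement that most putative BSs of a given TF reside on Alu amounts to a simple overlap count. The sketch below shows one way such a fraction could be computed. It assumes that Alu elements are given as half-open intervals and that a site counts as Alu-associated when its start position falls inside one; the function name and all coordinates are hypothetical and are not data from the study.

```python
def fraction_on_alu(site_positions, alu_intervals):
    """Fraction of sites whose start lies inside any [start, end) Alu interval."""
    if not site_positions:
        return 0.0
    on_alu = sum(
        any(start <= pos < end for start, end in alu_intervals)
        for pos in site_positions
    )
    return on_alu / len(site_positions)

# Hypothetical coordinates within one 5 kbp upstream region.
alu_intervals = [(100, 400), (2500, 2800)]
site_positions = [150, 390, 1200, 2600]

fraction = fraction_on_alu(site_positions, alu_intervals)
```

With these toy values three of the four sites fall on an Alu element.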
We propose that evolution used transposable elements to insert TF binding motifs into promoter regions. We observed that biosynthesis genes are enriched with Alu-associated BSs of developmental TFs. Since development and cell proliferation (of which biosynthesis is an essential component) have been proposed to be opposing processes, these TFs possibly play inhibitory roles, suppressing proliferation during differentiation.