Liu Bingchuan, Chen Jiajia, Shen Bairong
Center for Systems Biology, Soochow University, Suzhou, China.
BMC Syst Biol. 2011 May 4;5 Suppl 1(Suppl 1):S2. doi: 10.1186/1752-0509-5-S1-S2.
Bi-directional gene pairs have received considerable attention because of their prevalence in vertebrate genomes. However, their biological relevance and exact regulatory mechanism remain poorly understood. To study the intrinsic properties of this gene organization and the differences between bi- and uni-directional genes, we conducted a genome-wide investigation of their sequence composition, functional association and regulatory motifs.
We identified 1210 bi-directional gene pairs in the GRCh37 assembly, accounting for 11.6% of all human genes with RNA transcripts. CpG islands were detected in 98.42% of bi-directional promoters and 61.07% of uni-directional promoters. Functional enrichment analyses in GO and GeneGO both revealed that bi-directional genes tend to be associated with housekeeping functions in metabolic pathways and nuclear processes, and that 46.84% of the pair members are involved in the same biological function. By fold-enrichment analysis, we characterized 73 and 43 putative transcription factor binding sites (TFBSs) from the TRANSFAC and JASPAR databases, respectively, that preferentially occur in bi-directional promoters. Text mining showed that some of them have been verified by individual experiments, and several novel binding motifs were also identified.
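The fold-enrichment idea mentioned above can be sketched as a ratio of occurrence frequencies. This is only a minimal illustration: the abstract does not give the exact formula, background set or cutoff used, so the function name `fold_enrichment`, the choice of uni-directional promoters as background, and the counts in the example are all assumptions.

```python
def fold_enrichment(hits_fg, n_fg, hits_bg, n_bg):
    """Ratio of a motif's occurrence frequency in a foreground promoter set
    to its frequency in a background set.

    Assumed setup (not stated in the abstract): foreground = bi-directional
    promoters, background = uni-directional promoters, and a "hit" is a
    promoter containing at least one predicted site for the motif.
    """
    if n_fg <= 0 or n_bg <= 0:
        raise ValueError("both promoter sets must be non-empty")
    freq_fg = hits_fg / n_fg
    freq_bg = hits_bg / n_bg
    if freq_bg == 0:
        # Motif never seen in the background: enrichment is unbounded
        # (or undefined if it is also absent from the foreground).
        return float("inf") if freq_fg > 0 else float("nan")
    return freq_fg / freq_bg


# Hypothetical counts, for illustration only.
ratio = fold_enrichment(hits_fg=50, n_fg=100, hits_bg=25, n_bg=100)
print(ratio)
```

A ratio above 1 would indicate that the motif occurs preferentially in the foreground set. In practice such a ratio is usually paired with a significance test, which is omitted here.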
Bi-directional promoters show a significant enrichment of CpG islands as well as a high GC content. We provide insight into the functional constraints on bi-directional genes and find that paired genes are biased toward functional similarity. We hypothesize that this functional association underlies the co-expression of bi-directional genes. Furthermore, we propose a set of putative regulatory motifs in bi-directional promoters for further experimental studies of the transcriptional regulation of bi-directional genes.
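The two sequence properties named above, GC content and CpG enrichment, can be computed with a few lines of code. The sketch below is a hedged illustration rather than the detection procedure used in the study, which the abstract does not specify. The thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6) are the commonly cited Gardiner-Garden and Frommer criteria and are assumed here; the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Apply the assumed thresholds to a single candidate window."""
    return (
        len(seq) >= min_len
        and gc_content(seq) > min_gc
        and cpg_obs_exp(seq) > min_oe
    )


# Toy sequences, for illustration only.
print(looks_like_cpg_island("CG" * 100))
print(looks_like_cpg_island("AT" * 100))
```

A real scan would slide a window along each promoter and merge qualifying windows; this sketch only scores one window at a time.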