Center for Evolutionary Functional Genomics, The Biodesign Institute, Arizona State University, Tempe, AZ 85287-5301, USA.
Bioinformatics. 2009 Oct 1;25(19):2473-7. doi: 10.1093/bioinformatics/btp462. Epub 2009 Jul 24.
In functional genomics, it is frequently useful to correlate expression levels of genes to identify transcription factor binding sites (TFBS) via the presence of common sequence motifs. The underlying assumption is that co-expressed genes are more likely to contain shared TFBS and, thus, TFBS can be identified computationally. Indeed, gene pairs with a very high expression correlation show a significant excess of shared binding sites in yeast. We have tested this assumption in a more complex organism, Drosophila melanogaster, by using experimentally determined TFBS and microarray expression data. We have also examined the reverse relationship between the expression correlation and the extent of TFBS sharing.
Pairs of genes with shared TFBS show, on average, a higher degree of co-expression than those with no common TFBS in Drosophila. However, the reverse does not hold true: gene pairs with high expression correlations do not share significantly larger numbers of TFBS. An exception to this observation arises when comparing the expression of genes from the earliest stages of embryonic development. Interestingly, semantic similarity between gene annotations (Biological Process) is much more strongly associated with TFBS sharing than expression correlation is. We discuss these results in light of reverse engineering approaches to computationally predict regulatory sequences by using comparative genomics.
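The forward comparison described above (co-expression of gene pairs that share TFBS versus pairs that do not) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the use of Pearson correlation as the co-expression measure, and the toy data are all assumptions made for illustration, and a real analysis would add a significance test.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def coexpression_by_tfbs_sharing(expression, tfbs):
    """Return (mean r of pairs sharing a TFBS, mean r of pairs sharing none).

    expression: hypothetical dict mapping gene -> list of expression values
    tfbs: hypothetical dict mapping gene -> set of bound transcription factors
    Assumes both groups end up non-empty.
    """
    shared, unshared = [], []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        # A pair counts as "sharing" if at least one factor binds both genes.
        if tfbs[g1] & tfbs[g2]:
            shared.append(r)
        else:
            unshared.append(r)
    return mean(shared), mean(unshared)


# Toy data, invented purely for illustration (not from the study).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
tfbs = {
    "geneA": {"TF1", "TF2"},
    "geneB": {"TF1"},
    "geneC": {"TF3"},
}

mean_shared, mean_unshared = coexpression_by_tfbs_sharing(expression, tfbs)
print(mean_shared, mean_unshared)
```

The reverse comparison would instead bin gene pairs by expression correlation and count shared TFBS within each bin; the abstract reports that this direction shows no significant excess except for the earliest embryonic stages.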