Dieterich Christoph, Rahmann Sven, Vingron Martin
Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
Bioinformatics. 2004 Aug 4;20 Suppl 1:i109-15. doi: 10.1093/bioinformatics/bth908.
Our understanding of how genes are regulated in a concerted fashion is still limited. In particular, complex phenomena such as cell cycle regulation in multicellular organisms are poorly understood. We therefore investigated conserved predicted transcription factor binding sites (TFBSs) in human-mouse upstream regions of genes that can be associated with a particular cell cycle phase in HeLa cells. TFBSs were predicted from selected binding site motifs (represented by position weight matrices, PWMs) using a statistical approach. A regulatory role for a transcription factor is more probable if its predicted TFBSs are enriched in upstream regions of genes that are associated with a subset of cell cycle phases. We tested for this association by computing exact P-values for the observed phase distributions under the null distribution defined by the relative amount of conserved upstream sequence of genes per cell cycle phase. We considered non-exonic and 5'-untranslated region (5'-UTR) binding sites separately and corrected for multiple testing by taking the false discovery rate into account.
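The testing step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which test statistic defines the exact P-value or which false discovery rate procedure was used. The sketch assumes an exact multinomial test (summing the probabilities of all outcomes that are no more likely than the observed one) and the Benjamini-Hochberg procedure. The function names, the number of phases, and all numbers are hypothetical.

```python
from math import exp, lgamma, log


def log_multinomial_pmf(counts, probs):
    """Log-probability of `counts` under a multinomial with cell probabilities `probs`."""
    n = sum(counts)
    logp = lgamma(n + 1)
    for c, p in zip(counts, probs):
        logp -= lgamma(c + 1)
        if c > 0:
            if p == 0:
                return float("-inf")  # impossible outcome
            logp += c * log(p)
    return logp


def compositions(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def exact_multinomial_pvalue(observed, null_probs, tol=1e-9):
    """Sum the null probabilities of all outcomes no more likely than `observed`.

    Assumption: "more extreme" means "lower probability under the null".
    Full enumeration is only feasible for small site counts and few phases.
    """
    n = sum(observed)
    log_obs = log_multinomial_pmf(observed, null_probs)
    total = 0.0
    for outcome in compositions(n, len(observed)):
        lp = log_multinomial_pmf(outcome, null_probs)
        if lp <= log_obs + tol:
            total += exp(lp)
    return min(total, 1.0)


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level `alpha`."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical input: conserved upstream sequence (bp) per cell cycle phase.
conserved_bp = [12000, 9000, 15000, 14000, 10000]
null_probs = [bp / sum(conserved_bp) for bp in conserved_bp]

# Hypothetical counts of conserved predicted sites per phase, one list per PWM.
site_counts = [[9, 0, 1, 0, 0], [2, 2, 3, 2, 1], [0, 0, 0, 1, 7]]
pvalues = [exact_multinomial_pvalue(c, null_probs) for c in site_counts]
significant = benjamini_hochberg(pvalues, alpha=0.05)
```

Under this reading, the null probability of a phase is its share of the total conserved upstream sequence, so a phase with more alignable sequence is expected to collect more predicted sites by chance alone.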
We identified 22 significant non-exonic and 11 significant 5'-UTR PWM phase distributions while expecting one false discovery. Many of the corresponding transcription factors (e.g. members of the thyroid hormone/retinoid receptor subfamily) have already been associated with cell cycle regulation, proliferation and development. Our method therefore appears to be a suitable tool for detecting putative cell cycle regulators among known human transcription factors.
Further details and supplementary data can be obtained from http://corg.molgen.mpg.de/cellcycle