Wang Junwen, Hannenhalli Sridhar
Penn Center for Bioinformatics and Department of Genetics, University of Pennsylvania, Philadelphia, 19104-6021, USA.
Biochem Biophys Res Commun. 2006 Aug 18;347(1):166-77. doi: 10.1016/j.bbrc.2006.06.062. Epub 2006 Jun 21.
Accurate identification of gene promoters remains an important challenge. Computational approaches to this problem rely on promoter sequence attributes that are believed to be critical for transcription initiation. Here we report a probabilistic model that captures two important properties of promoters not used by previous methods, namely the location preference and the co-occurrence of promoter elements. Additionally, we found that many of the position-specific DNA elements are strongly linked with the function of the gene product. For instance, a highly conserved motif, CCTTT, at the -1 position is strongly associated with protein synthesis and with cellular and tissue development. Our comparative analysis of promoter classes reveals that promoters devoid of CpG islands are more conserved and have fewer alternative transcription start sites. The discovered links between promoter elements and gene function allow us to infer genetic networks from promoter elements. The web server for the PSPA promoter predictor is available at /PSPA.
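To illustrate what "location preference" of promoter elements can mean computationally, the following is a minimal sketch, not the PSPA model itself. It assumes that training promoters are equal-length sequences aligned at the transcription start site, and it scores each k-mer at each position by a log-odds ratio against a position-independent background. The function names, the pseudocount, and the toy sequences are all hypothetical, and the co-occurrence of elements is not modeled here.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return the overlapping k-mers of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train_position_log_odds(promoters, background, k=5, pseudo=1.0):
    """Build a log-odds function for k-mers at each promoter position.

    Assumption: all promoter sequences have the same length and are
    aligned at the same reference point (e.g. the start site).
    """
    n_pos = len(promoters[0]) - k + 1
    vocab = 4 ** k  # number of possible DNA k-mers

    # Position-independent k-mer counts in the background set.
    bg_counts = Counter(km for s in background for km in kmers(s, k))
    bg_total = sum(bg_counts.values())

    # One counter per position for the promoter (foreground) set.
    pos_counts = [Counter() for _ in range(n_pos)]
    for s in promoters:
        for i, km in enumerate(kmers(s, k)):
            pos_counts[i][km] += 1

    def log_odds(pos, km):
        # Pseudocounts keep unseen k-mers from giving log(0).
        p_fg = (pos_counts[pos][km] + pseudo) / (len(promoters) + pseudo * vocab)
        p_bg = (bg_counts[km] + pseudo) / (bg_total + pseudo * vocab)
        return math.log(p_fg / p_bg)

    return log_odds


def score(seq, log_odds, k=5):
    """Sum position-specific log-odds over a sequence of training length."""
    return sum(log_odds(i, km) for i, km in enumerate(kmers(seq, k)))


if __name__ == "__main__":
    # Toy, made-up sequences purely for illustration.
    toy_promoters = ["ACGTAC", "ACGTAC", "ACGTTC"]
    toy_background = ["TTTTTT", "GGGGGG", "CACACA"]
    lo = train_position_log_odds(toy_promoters, toy_background, k=2)
    print(score("ACGTAC", lo, k=2) > score("TTTTTT", lo, k=2))
```

A sequence that carries the preferred k-mers at the preferred positions receives a higher score than one that does not. A fuller model in the spirit of the abstract would also need a term for pairs of elements that occur together.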