Florquin Kobe, Saeys Yvan, Degroeve Sven, Rouzé Pierre, Van de Peer Yves
Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology (VIB), Ghent University, Technologiepark 927, B-9052 Ghent, Belgium.
Nucleic Acids Res. 2005 Jul 27;33(13):4255-64. doi: 10.1093/nar/gki737. Print 2005.
DNA encodes at least two independent levels of functional information. The first level encodes proteins and the sequence targets of DNA-binding factors, while the second is contained in the physical and structural properties of the DNA molecule itself. Although these properties are ultimately determined by the nucleotide sequence, the cell exploits them in a way in which the sequence plays no role other than to support or facilitate certain spatial structures. In this work, we focus on these structural properties, comparing them between different organisms and assessing their ability to describe the core promoter. We demonstrate the existence of distinct types of core promoters, based on a clustering of their structural profiles. These results indicate that the structural profiles are highly conserved within plants (Arabidopsis and rice) and within animals (human and mouse), but differ considerably between plants and animals. Furthermore, we demonstrate that these structural profiles offer an alternative way of describing the core promoter, in addition to more classical motif-based or IUPAC-based approaches. Using the structural profiles as discriminatory elements to separate promoter regions from non-promoter regions, reliable models can be built to identify core-promoter regions using a strictly computational approach.
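The abstract does not say how a structural profile is computed. As a rough illustration only, the sketch below assumes one common approach: each dinucleotide step is looked up in a table of property values, and the resulting track is smoothed with a sliding window. The names `PLACEHOLDER_SCALE`, `structural_profile` and `mean_profile`, and the window size, are hypothetical. The scale values are arbitrary placeholders (0 to 15), not measured physical parameters, and would need to be replaced with a published dinucleotide scale before any real use.

```python
from itertools import product

# ASSUMPTION: arbitrary placeholder values (AA=0.0, AC=1.0, ..., TT=15.0).
# These are NOT real structural parameters; substitute a published scale.
PLACEHOLDER_SCALE = {
    a + b: float(i) for i, (a, b) in enumerate(product("ACGT", repeat=2))
}


def structural_profile(seq, scale=PLACEHOLDER_SCALE, window=3):
    """Map each dinucleotide step to a value, then take a moving average."""
    seq = seq.upper()
    steps = [seq[i:i + 2] for i in range(len(seq) - 1)]
    unknown = sorted({s for s in steps if s not in scale})
    if unknown:
        raise ValueError(f"no value for dinucleotide step(s): {unknown}")
    raw = [scale[s] for s in steps]
    if window < 1 or window > len(raw):
        raise ValueError("window must be between 1 and len(seq) - 1")
    # One smoothed value per window position along the sequence.
    return [
        sum(raw[i:i + window]) / window
        for i in range(len(raw) - window + 1)
    ]


def mean_profile(seqs, scale=PLACEHOLDER_SCALE, window=3):
    """Average the profiles of equal-length, pre-aligned sequences by position."""
    profiles = [structural_profile(s, scale, window) for s in seqs]
    if len({len(p) for p in profiles}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    return [sum(col) / len(col) for col in zip(*profiles)]


if __name__ == "__main__":
    # Toy sequences for illustration; not real promoter data.
    print(structural_profile("AACG", window=1))
    print(mean_profile(["AAAA", "TTTT"], window=1))
```

Under these assumptions, the position-wise mean profile of a set of sequences aligned on the transcription start site could be compared with that of a background set, or the per-sequence profiles could serve as feature vectors for clustering or for a promoter versus non-promoter classifier, which is the kind of use the abstract describes.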