Bucher P, Trifonov E N
Nucleic Acids Res. 1986 Dec 22;14(24):10009-26. doi: 10.1093/nar/14.24.10009.
A representative set of 168 eukaryotic POL II promoters has been compiled from the EMBL library and subjected to computer signal search analysis. Applying this technique to E. coli promoters as a control ensemble revealed the well-known consensus sequences at -35 and -10, which indicates that the methods are adequate for problems of this kind. The results obtained from the eukaryotic promoter set can be summarized as follows: Common sequence features are confined to a region between -50 and +10 relative to the transcriptional initiation site. The only well-conserved consensus sequence is TATAAA, centered at -28. A weak motif, CA followed preferentially by pyrimidines, surrounds the cap site. Two pentanucleotides that have been shown experimentally to stimulate transcription of certain genes, GGGCG and CCAAT, are moderately over-represented in the upstream region (between -129 and -50). However, they occur at highly variable distances from the initiation site.
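The positional statements above (a motif centered near a given offset, or over-represented within an upstream window) can be illustrated with a minimal Python sketch. This is not the signal search procedure used in the paper, only a plain substring count under stated assumptions: every sequence is aligned so that the initiation site (+1) sits at the same 0-based index `tss_index`, positions follow the usual convention with no position 0, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def relative_positions(seq, motif, tss_index):
    """Yield the promoter coordinate of each motif start in seq.

    Assumption: seq[tss_index] is the +1 nucleotide, and there is no
    position 0 (the base just upstream of +1 is -1).
    """
    seq = seq.upper()
    start = seq.find(motif)
    while start != -1:
        offset = start - tss_index
        yield offset + 1 if offset >= 0 else offset
        # Advance by one so overlapping occurrences are also found.
        start = seq.find(motif, start + 1)


def motif_positions(sequences, motif, tss_index):
    """Count motif start positions, pooled over all aligned sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(relative_positions(seq, motif, tss_index))
    return counts


def fraction_in_window(sequences, motif, tss_index, lo, hi):
    """Fraction of sequences with at least one motif start in [lo, hi]."""
    hits = sum(
        any(lo <= pos <= hi for pos in relative_positions(seq, motif, tss_index))
        for seq in sequences
    )
    return hits / len(sequences)


# Toy sequences invented for illustration; they are not promoter data.
toy = ["CCTATAAAGCACTCC", "TATAAAGGGCCATTC", "GGGCGCCAATCACTT"]
tata_counts = motif_positions(toy, "TATAAA", tss_index=10)
tata_fraction = fraction_in_window(toy, "TATAAA", tss_index=10, lo=-10, hi=-1)
```

Judging over-representation would additionally require comparing such a fraction with the value expected from base composition or from a control set, which this sketch does not attempt.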