Schneider T D, Stormo G D
National Cancer Institute, Laboratory of Mathematical Biology, Frederick, MD 21701.
Nucleic Acids Res. 1989 Jan 25;17(2):659-74. doi: 10.1093/nar/17.2.659.
In our previous analysis of the information at binding sites on nucleic acids, we found that most of the sites examined contain the amount of information expected from their frequency in the genome. The sequences at bacteriophage T7 promoters are an exception, because they are far more conserved (35 bits of information content) than should be necessary to distinguish them from the background of the Escherichia coli genome (17 bits). To determine the information actually used by the T7 RNA polymerase, promoters were chemically synthesized with many variations and those that function well in an in vivo assay were sequenced. Our analysis shows that the polymerase uses 18 bits of information, so the sequences at phage genomic promoters have significantly more information than the polymerase needs. The excess may represent the binding site of another protein.
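The comparison in the abstract is between two quantities measured in bits: the information needed to locate the sites in a genome, and the information actually present in the aligned site sequences. The following is a minimal sketch of how such quantities can be computed, assuming the usual Shannon definitions (needed information as log2 of genome positions per site, and conservation as 2 bits minus the per-position entropy, summed over positions). The function names and toy sequences are hypothetical, no small-sample correction is applied, and the example is not intended to reproduce the 35, 17, or 18 bit values reported here.

```python
import math
from collections import Counter


def r_frequency(genome_positions, n_sites):
    """Bits needed to pick n_sites out of genome_positions possible positions."""
    return math.log2(genome_positions / n_sites)


def r_sequence(aligned_sites):
    """Total information (bits) in a set of equal-length aligned DNA sites.

    Each position contributes 2 - H bits, where H is the Shannon entropy of
    the base frequencies at that position. No small-sample correction is
    applied, so values are overestimated for small site sets.
    """
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")

    total = 0.0
    for i in range(length):
        counts = Counter(site[i] for site in aligned_sites)
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += 2.0 - entropy
    return total


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences, for illustration only).
    sites = ["TAATACGA", "TAATACGA", "TAATACGT", "TAATTCGA"]
    print(f"Rsequence  = {r_sequence(sites):.2f} bits")
    # Hypothetical genome size and site count, chosen only to show the call.
    print(f"Rfrequency = {r_frequency(2**20, 16):.2f} bits")
```

When the first value greatly exceeds the second, as the abstract describes for the T7 promoters, the sites are more conserved than locating them would require.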