Li Wencheng, Zou Huan, Tao Meifeng
State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, China.
Antonie Van Leeuwenhoek. 2007 Nov;92(4):417-27. doi: 10.1007/s10482-007-9170-6. Epub 2007 Jun 12.
The mechanism of translation initiation is responsible for shaping the mRNA sequences downstream of the start codon. However, this region has not been systematically analyzed in prokaryotes. We used sequence logos and statistical methods to analyze the patterns of overrepresented sequences in this region for 125 species of bacteria and 23 species of archaea. The specific positions are compared with the first 33 amino acids of the proteins. At the 2nd amino acid position, Lys, Ser, or Thr is highly overrepresented in 68% to 84% of the genomes examined, and Ala is highly overrepresented in 57% of the genomes. Overrepresentation of Lys2 is negatively correlated with the G + C content of the genomes, whereas overrepresentation of Ser2 or Thr2 is positively correlated with it. Ile at the 4th to 8th positions was found to be overrepresented in 91% of the genomes analyzed, and this appears to be conserved in both bacteria and archaea. Organisms growing at high temperatures show a relatively low extent of nucleotide bias at the 5' termini of open reading frames (ORFs). The extent to which A is overrepresented and G is underrepresented at ORF 5' termini is reduced in thermophiles and hyperthermophiles, for both archaea and bacteria.
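To make the idea of position-specific overrepresentation concrete, the following is a minimal sketch, not the authors' method. The abstract does not state which statistic was used, so the enrichment ratio below (frequency of a residue at one position divided by its frequency over all residues of the same proteins) is an assumption, and the function names `position_enrichment` and `gc_content` are hypothetical.

```python
from collections import Counter


def position_enrichment(proteins, position, residue):
    """Ratio of a residue's frequency at a 1-based position to its
    background frequency over all residues of the same proteins.

    Assumption: a simple observed/background ratio; values above 1
    suggest overrepresentation at that position.
    """
    # Residues found at the requested position (skip proteins that are too short).
    at_pos = [p[position - 1] for p in proteins if len(p) >= position]
    if not at_pos:
        raise ValueError("no protein is long enough for this position")
    observed = at_pos.count(residue) / len(at_pos)

    # Background frequency: the residue's share of all residues pooled together.
    counts = Counter("".join(proteins))
    background = counts[residue] / sum(counts.values())
    if background == 0:
        raise ValueError("residue does not occur in the background")
    return observed / background


def gc_content(dna):
    """Fraction of G and C nucleotides in a DNA string."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    toy_proteins = ["MKAA", "MKLA", "MSAA", "MALK"]
    print(round(position_enrichment(toy_proteins, 2, "K"), 2))  # 2.67
    print(gc_content("ATGC"))  # 0.5
```

Computing such a ratio per genome and pairing it with that genome's G + C content would give the two variables needed for a correlation analysis like the one described for Lys2, Ser2, and Thr2.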