Wuitschick J D, Karrer K M
Department of Biology, Marquette University, Milwaukee, Wisconsin 53201-1881, USA.
J Eukaryot Microbiol. 1999 May-Jun;46(3):239-47. doi: 10.1111/j.1550-7408.1999.tb05120.x.
In recent years, the amount of molecular sequencing data from Tetrahymena thermophila has dramatically increased. We analyzed G + C content, codon usage, initiator codon context and stop codon sites in the extremely A + T rich genome of this ciliate. Average G + C content was 38% for protein coding regions, 21% for 5' non-coding sequences, 19% for 3' non-coding sequences, 15% for introns, 19% for micronuclear limited sequences and 17% for macronuclear retained sequences flanking micronuclear specific regions. The 75 available T. thermophila protein coding sequences favored codons ending in T and, where possible, avoided those with G in the third position. Highly expressed genes were relatively G + C-rich and exhibited an extremely biased pattern of codon usage while developmentally regulated genes were more A + T-rich and showed less codon usage bias. Regions immediately preceding Tetrahymena translation initiator codons were generally A-rich. For the 60 stop codons examined, the frequency of G in the end + 1 site was much higher than expected whereas C never occupied this position.
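The two simplest measurements named in the abstract, G + C content and the base found in the third codon position, can be sketched as follows. This is a minimal illustration, not the authors' method: the function names `gc_content` and `third_position_counts` are hypothetical, and the toy sequence is invented rather than taken from a real T. thermophila gene.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the A/C/G/T bases in seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def third_position_counts(cds: str) -> Counter:
    """Count the base in the third position of each complete codon.

    Assumes cds starts in frame; trailing bases that do not fill a
    whole codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i + 2] for i in range(0, usable, 3))


# Invented toy coding sequence (assumption, for illustration only).
# It ends in TGA because TAA and TAG encode glutamine in Tetrahymena.
toy_cds = "ATGAAATTTGCTTGA"

print(f"G + C content: {gc_content(toy_cds):.1%}")
print("Third-position bases:", dict(third_position_counts(toy_cds)))
```

Applied to a set of coding sequences, per-gene values from functions like these could be compared between groups such as highly expressed and developmentally regulated genes, which is the kind of contrast the abstract reports.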