Frohlich D R, Wells M A
Department of Biochemistry, University of Arizona, Tucson 85721.
J Mol Evol. 1994 May;38(5):476-81. doi: 10.1007/BF00178847.
Patterns of codon usage were examined in the coding regions of the 23 known lepidopteran hemolymph proteins. Coding triplets are GC rich at the third position, and a significant linear relationship between the GC content of silent and nonsilent (replacement) sites was demonstrated. Intron GC content was significantly lower than that of coding regions, and no relationship was found between intron GC content and GC content at either silent or nonsilent sites. Although hemolymph proteins are all produced by the same tissue, the fat body, significantly less bias was observed when all moth sequences were pooled than when sequences of the two major species were analyzed separately, as predicted by the genome hypothesis. Where no statistically significant bias was observed, the amino acids involved were almost exclusively polar or acidic/basic. Calculation of codon adaptation indices (CAI) was of limited value in quantifying the degree of codon bias, which probably reflects the complexity of multicellular-organism life cycles and the changing patterns of gene expression over different developmental stages.
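To make the position-specific GC measure concrete, the following is a minimal sketch of how GC content at each codon position might be computed for an in-frame coding sequence. It is not the authors' procedure. The function name `gc_by_codon_position` and the toy sequence are hypothetical, and using the third codon position as a stand-in for silent sites is a simplifying assumption, since some third-position changes alter the amino acid and some first-position changes do not.

```python
def gc_by_codon_position(cds):
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence; trailing bases that do
    not complete a codon are ignored. (Hypothetical helper for illustration.)
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete final codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence is shorter than one codon")

    counts = [0, 0, 0]
    for i in range(usable):
        if seq[i] in "GC":
            counts[i % 3] += 1  # i % 3 is the 0-based position within the codon

    return tuple(c / n_codons for c in counts)


# Toy sequence invented for illustration, not taken from the paper:
# ATG GCC AAG CTG TTC
toy = "ATGGCCAAGCTGTTC"
gc1, gc2, gc3 = gc_by_codon_position(toy)
print(f"GC1={gc1:.2f} GC2={gc2:.2f} GC3={gc3:.2f}")
```

Under these assumptions, regressing per-gene GC3 against the mean of GC1 and GC2 across a set of genes would give a rough analogue of the silent versus replacement site comparison described in the abstract.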