McHardy Alice C, Pühler Alfred, Kalinowski Jörn, Meyer Folker
Zentrum für Genomforschung, Bielefeld, Germany.
Proteomics. 2004 Jan;4(1):46-58. doi: 10.1002/pmic.200300501.
Synonymous codon usage is commonly used to estimate the expression levels of Escherichia coli genes and has also been used to predict highly expressed genes in a number of prokaryotic genomes. By comparing expression level-dependent features in codon usage with protein abundance data from two proteome studies of exponentially growing E. coli and Bacillus subtilis cells, we evaluate whether the implicit assumption of this approach, that codon usage reflects expression level, is confirmed by experimental data. Log-odds ratio scores are used to model differences in codon usage between highly expressed genes and the genomic average. Using these scores, the strength and significance of expression level-dependent features in codon usage were determined for the genes of the Escherichia coli, Bacillus subtilis and Haemophilus influenzae genomes. The comparison of codon usage features with protein abundance data confirmed a relationship between the two, although exceptions, possibly related to functional context, were found. For species with expression level-dependent features in their codon usage, the applied methodology could be used to improve in silico simulations of the outcome of two-dimensional gel electrophoresis experiments.
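The log-odds scoring idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption for illustration: codon frequencies are taken over all 64 codons (not normalized per amino acid), a pseudocount of 1 avoids division by zero, a gene's score is the mean per-codon log-odds, and the function names (`codon_frequencies`, `log_odds_table`, `score_gene`) and the toy sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# All 64 codons over the DNA alphabet.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codons_of(seq):
    """Split a coding sequence into in-frame codons, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_frequencies(seqs, pseudocount=1.0):
    """Relative codon frequencies over a set of sequences (assumed: global, with pseudocounts)."""
    counts = Counter()
    for s in seqs:
        counts.update(c for c in codons_of(s) if c in CODON_SET)
    total = sum(counts[c] for c in CODONS) + pseudocount * len(CODONS)
    return {c: (counts[c] + pseudocount) / total for c in CODONS}


def log_odds_table(high_expr_seqs, genome_seqs):
    """Per-codon log-odds of usage in highly expressed genes versus the genomic average."""
    f_high = codon_frequencies(high_expr_seqs)
    f_bg = codon_frequencies(genome_seqs)
    return {c: math.log(f_high[c] / f_bg[c]) for c in CODONS}


def score_gene(seq, table):
    """Mean per-codon log-odds of a gene; positive values resemble the highly expressed set."""
    cs = [c for c in codons_of(seq) if c in table]
    if not cs:
        return 0.0
    return sum(table[c] for c in cs) / len(cs)


# Toy data (hypothetical): one "highly expressed" gene that prefers GCT for alanine,
# and a "genome" made of that gene plus one gene using other alanine codons.
high = ["ATGGCTGCTGCTAAAGCT"]
genome = high + ["ATGGCCGCAGCGAAGGCC"]
table = log_odds_table(high, genome)

favoured = score_gene("GCTGCTGCT", table)    # uses the codon enriched in the high set
unfavoured = score_gene("GCCGCCGCC", table)  # uses a codon depleted in the high set
```

In this toy setup `favoured` comes out positive and `unfavoured` negative, which is the kind of per-gene signal that could then be compared against protein abundance measurements. Assessing the significance of such scores, as the abstract describes, would need an additional step not sketched here.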