Carbone A, Képès F, Zinovyev A
Génomique Analytique, Université Pierre et Marie Curie, INSERM U511, 91, Bd de l'Hôpital, 75013 Paris, France.
Mol Biol Evol. 2005 Mar;22(3):547-61. doi: 10.1093/molbev/msi040. Epub 2004 Nov 10.
New and simple numerical criteria based on a codon adaptation index are applied to the complete genomic sequences of 80 Eubacteria and 16 Archaea to infer weak and strong genome tendencies toward content bias, translational bias, and strand bias. These criteria can be applied to all microbial genomes, even those for which little biological information is known, and a codon bias signature, that is, the collection of strong biases displayed by a genome, can be derived automatically. A codon bias space, in which genomes are identified by their preferred codons, is proposed as a novel formal framework for interpreting genomic relationships. Principal component analysis confirms that, although GC content has a dominant effect on codon bias space, thermophilic and mesophilic species can be identified and separated by codon preferences. Two more examples concerning lifestyle are studied with linear discriminant analysis: suitable separating functions, characterized by sets of preferred codons, are provided to discriminate translationally biased (hyper)thermophiles from mesophiles, and to discriminate organisms with different respiratory characteristics (aerobic, anaerobic, facultative aerobic, and facultative anaerobic). These results suggest that codon bias space might reflect the geometry of a prokaryotic "physiology space." Evolutionary perspectives are noted, numerical criteria and distances among organisms are validated on known cases, and various results and predictions are discussed on both methodological and biological grounds.
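For readers unfamiliar with the quantity underlying these criteria, the following is a minimal sketch of the classic codon adaptation index (Sharp and Li, 1987): each codon gets a weight equal to its frequency in a reference gene set divided by the frequency of the most used synonymous codon, and a gene's score is the geometric mean of the weights of its codons. The abstract does not say which variant of the index or which reference set the authors use, so this should not be read as their procedure. The function names (`codons`, `relative_adaptiveness`, `cai`) and the 0.5 pseudocount for unseen codons are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete in-frame codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """Weight of each codon: its count divided by the count of the most
    frequent synonymous codon in the reference set.

    Met, Trp, and stop codons are excluded, since they carry no
    synonymous-choice information. Unseen codons get a pseudocount of 0.5
    (an assumption of this sketch) so that log(weight) stays finite.
    """
    counts = Counter(
        c for gene in reference_genes for c in codons(gene) if c in CODON_TABLE
    )
    weights = {}
    for aa in set(CODON_TABLE.values()) - {"*", "M", "W"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        family_counts = {c: counts[c] or 0.5 for c in family}
        top = max(family_counts.values())
        for c in family:
            weights[c] = family_counts[c] / top
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the scored codons in a gene."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    if not logs:
        raise ValueError("gene contains no codons that can be scored")
    return math.exp(sum(logs) / len(logs))


# Toy reference set (hypothetical sequences, not real genes).
reference = ["GCTGCTGCTGCC", "AAAAAAAAG"]
weights = relative_adaptiveness(reference)
score_preferred = cai("GCTAAA", weights)  # uses only the most frequent codons
score_rare = cai("GCCAAG", weights)       # uses less frequent synonyms
```

A gene built from the most frequent codons of the reference set scores highest, and scores drop as rarer synonymous codons are used. In practice the reference set would be a collection of genes believed to be highly expressed, and how that set is chosen strongly affects the result.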