Hansmann S, Martin W
Int J Syst Evol Microbiol. 2000 Jul;50 Pt 4:1655-1663. doi: 10.1099/00207713-50-4-1655.
Thirty-nine proteins encoded in a large gene cluster that is well conserved in gene content and gene order across 18 sequenced prokaryotic genomes were extracted, aligned and subjected to phylogenetic analysis. In individual analyses of the alignments, only two probable examples of lateral gene transfer between archaea and eubacteria were detected, involving the genes for ribosomal protein Rpl23 and adenylate kinase. Amino acid sequences for 35 of the 39 proteins were concatenated to yield a data set of 9087 amino acid positions per genome. Many of these proteins, 33 of which are ribosomal proteins, are not highly conserved across distantly related organisms and thus contain many regions that are difficult to align. Phylogenetic analyses were performed with subsets of the concatenated data from which the most highly variable sites had been iteratively removed, using the number of different amino acids that occur at a given site as the criterion of variability. Glycine, which has a strong influence on protein structure, tended to be more frequent at the most conserved (least polymorphic) sites. With most subsets of the data, the proteins from the cyanobacterium Synechocystis tended to branch with their homologues from gram-positive bacteria. The results indicate that excluding only a few percent of poorly alignable sites from phylogenetic analysis can have a severe impact upon the phylogeny inferred and that bootstrap support for branches can fluctuate substantially, depending upon which sites are excluded.
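The site-removal criterion described in the abstract (the number of different amino acids observed at a site) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the treatment of `-`, `?` and `X` as missing data, the stepwise thresholds and the toy alignment are all assumptions made for illustration.

```python
# Hypothetical sketch: filter alignment columns by the number of distinct
# amino acids they contain, then tighten the cutoff step by step.

MISSING = set("-?X")  # assumption: gaps and unknown residues are not counted


def site_variability(alignment):
    """Return, for each column, the number of distinct amino acids observed."""
    seqs = list(alignment.values())
    n_cols = len(seqs[0])
    if any(len(s) != n_cols for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [len({s[i] for s in seqs} - MISSING) for i in range(n_cols)]


def filter_sites(alignment, max_states):
    """Keep only columns with between 1 and max_states distinct amino acids."""
    var = site_variability(alignment)
    keep = [i for i, v in enumerate(var) if 1 <= v <= max_states]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


def iterative_subsets(alignment):
    """Yield (threshold, subset) pairs, removing the most variable sites first."""
    top = max(site_variability(alignment))
    for threshold in range(top, 0, -1):
        yield threshold, filter_sites(alignment, threshold)


# Toy alignment (invented data, four taxa and five sites).
toy = {
    "taxonA": "MGKLV",
    "taxonB": "MGRIV",
    "taxonC": "MGQMV",
    "taxonD": "MGKLA",
}

for threshold, subset in iterative_subsets(toy):
    n_sites = len(next(iter(subset.values())))
    print(f"max {threshold} amino acids per site: {n_sites} sites retained")
```

Each subset produced this way would then be passed to a separate tree-building and bootstrap step, which is outside the scope of this sketch. Comparing the trees across thresholds is what reveals how sensitive the inferred phylogeny is to the excluded sites.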