Escobar-Páramo Patricia, Sabbagh Audrey, Darlu Pierre, Pradillon Olivier, Vaury Christelle, Denamur Erick, Lecointre Guillaume
INSERM E0339, IFR 02, Faculté de Médecine Xavier Bichat, 16 rue Henri Huchard, Paris 75018, France.
Mol Phylogenet Evol. 2004 Jan;30(1):243-50. doi: 10.1016/s1055-7903(03)00181-7.
Phylogenetic reconstruction of bacterial species from DNA sequences is hampered by horizontal gene transfer. One way to overcome the confounding influence of such gene movement is to identify and remove the sequences responsible for significant character incongruence when compared with a reference dataset free of horizontal transfer (e.g., multilocus enzyme electrophoresis, restriction fragment length polymorphism, or random amplified polymorphic DNA), using the incongruence length difference (ILD) test of Farris et al. [Cladistics 10 (1995) 315]. Because obtaining this "whole genome dataset" before reconstructing a phylogeny is clearly troublesome, we tested alternative approaches that dispense with such a reference dataset, designed for a species with a modest level of horizontal gene transfer, i.e., Escherichia coli. Eleven genes, either already available or sequenced in this work, were studied in a set of 30 E. coli reference (ECOR) strains. Both approaches, using the ILD test to assess incongruence between each gene and all the remaining genes (here 10) so as to remove the sequences responsible for significant incongruence, or running a simultaneous analysis without any removals, gave robust phylogenies with slight topological differences. The ILD test remains a suitable method for estimating the level of horizontal gene transfer in bacterial species. Supertrees also had suitable properties for extracting the phylogeny of strains, because the way they summarize taxonomic congruence clearly limits the impact of individual gene transfers on the global topology. Furthermore, this work significantly improved the accuracy of the phylogeny within E. coli.
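To make the ILD statistic mentioned above concrete, the following is a minimal toy sketch, not the authors' pipeline. It assumes only four taxa, so that the three possible unrooted topologies can be searched exhaustively with Fitch parsimony, and it treats each gene as a list of alignment columns. The function names (`fitch_join`, `site_length`, `min_length`, `ild`, `ild_test`), the example columns, and the `(hits + 1) / (replicates + 1)` p-value convention are all illustrative assumptions; a real analysis of 30 strains would need a heuristic tree search in dedicated phylogenetics software.

```python
import random

# The three unrooted topologies for four taxa, indexed 0..3.
TOPOLOGIES = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def fitch_join(x, y):
    """Fitch step: return (state set, cost) for joining two child state sets."""
    common = x & y
    return (common, 0) if common else (x | y, 1)


def site_length(column, topo):
    """Parsimony length of one alignment column on one four-taxon topology."""
    (a, b), (c, d) = topo
    left, cost_left = fitch_join({column[a]}, {column[b]})
    right, cost_right = fitch_join({column[c]}, {column[d]})
    _, cost_mid = fitch_join(left, right)
    return cost_left + cost_right + cost_mid


def min_length(columns):
    """Length of the most parsimonious tree for a set of columns."""
    return min(sum(site_length(col, t) for col in columns) for t in TOPOLOGIES)


def ild(cols_a, cols_b):
    """ILD = L(combined) - (L(A) + L(B)); zero when the partitions agree."""
    return min_length(cols_a + cols_b) - (min_length(cols_a) + min_length(cols_b))


def ild_test(cols_a, cols_b, replicates=999, seed=0):
    """Permutation test: reshuffle columns into partitions of the original sizes."""
    rng = random.Random(seed)
    observed = ild(cols_a, cols_b)
    pooled = cols_a + cols_b  # new list, so shuffling leaves the inputs untouched
    hits = 0
    for _ in range(replicates):
        rng.shuffle(pooled)
        if ild(pooled[:len(cols_a)], pooled[len(cols_a):]) >= observed:
            hits += 1
    # Assumed convention: count the observed partition as one replicate.
    return observed, (hits + 1) / (replicates + 1)


# Hypothetical data: gene A groups taxa (0,1)(2,3); gene B groups (0,2)(1,3).
gene_a = ["AAGG"] * 5
gene_b = ["AGAG"] * 5
observed, p_value = ild_test(gene_a, gene_b)
print(observed, p_value)
```

In this toy setting a small p-value indicates that the two genes support conflicting trees more strongly than random partitions of the same characters would, which is the kind of signal the abstract uses to flag candidate horizontally transferred sequences. Testing each gene against the concatenation of all the others, as described above, would amount to calling `ild_test` once per gene.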