Lima-Mendez Gipsi
Laboratoire de Bioinformatique des Génomes et des Réseaux, Université Libre de Bruxelles, Bruxelles, Belgium.
Methods Mol Biol. 2012;804:81-91. doi: 10.1007/978-1-61779-361-5_5.
The tree of life is the classical representation of the evolutionary relationships between extant species. A tree is appropriate for displaying the divergence of species through mutation, i.e., by vertical descent. However, lateral gene transfer (LGT) is excluded from such representations. When the contribution of LGT to genome evolution cannot be neglected (e.g., for prokaryotes and mobile genetic elements), the tree becomes misleading. Networks offer an intuitive way to represent both vertical and horizontal relationships, and overlapping groups within such graphs are better suited to classifying the genomes they contain. Here, we describe a method to represent both vertical and horizontal relationships. We start with a set of genomes whose encoded proteins have been grouped into families based on sequence similarity. Next, all pairs of genomes are compared by counting the number of proteins classified into the same family. From this comparison, we derive a weighted graph in which genomes sharing a significant number of similar proteins are linked. Finally, we apply a two-step clustering to this graph to produce a classification in which nodes can be assigned to multiple clusters. The procedure can be performed using the Network Analysis Tools (NeAT) website.
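The graph-construction step described in the abstract (compare all genome pairs, count shared protein families, link pairs whose overlap is significant) can be illustrated with a minimal sketch. The abstract does not state which significance test or edge weight is used, so several choices here are assumptions: each genome is reduced to the set of families it contains, significance is assessed with a hypergeometric upper-tail test, the edge weight is `-log10(p)`, and the threshold `alpha` is arbitrary. The function names and the toy data are hypothetical and are not part of NeAT.

```python
from itertools import combinations
from math import comb, log10


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are 'successes'."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def build_genome_graph(genome_families, alpha=0.05):
    """Return {(genome_a, genome_b): weight} for significantly similar pairs.

    genome_families maps a genome name to the set of protein families it
    contains. The test, the weight and alpha are illustrative assumptions.
    """
    # All families observed in any genome form the background population.
    n_families = len(set().union(*genome_families.values()))
    edges = {}
    for a, b in combinations(sorted(genome_families), 2):
        fam_a, fam_b = genome_families[a], genome_families[b]
        shared = len(fam_a & fam_b)
        if shared == 0:
            continue  # no common family, so no candidate edge
        p = hypergeom_sf(shared, n_families, len(fam_a), len(fam_b))
        if p <= alpha:
            edges[(a, b)] = -log10(p)  # assumed weight: larger means stronger link
    return edges


# Toy input (hypothetical): five genomes described by family identifiers.
toy_genomes = {
    "g1": {"F1", "F2", "F3", "F4"},
    "g2": {"F1", "F2", "F3", "F5"},
    "g3": {"F6", "F7", "F8", "F9"},
    "g4": {"F6", "F7", "F8", "F10"},
    "g5": {f"F{i}" for i in range(11, 21)},
}
toy_edges = build_genome_graph(toy_genomes)
for pair, weight in sorted(toy_edges.items()):
    print(pair, round(weight, 3))
```

The resulting dictionary of weighted edges is the kind of input a graph-clustering step would take. The two-step clustering itself is not sketched here, because the abstract does not name the algorithms involved.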