Willems Matthieu, Tahiri Nadia, Makarenkov Vladimir
Département d'informatique, Université du Québec à Montréal, Case postale 8888, Succursale Centre-ville, Montréal (Québec) H3C 3P8, Canada.
J Bioinform Comput Biol. 2014 Oct;12(5):1450024. doi: 10.1142/S0219720014500243. Epub 2014 Sep 14.
Several algorithms and software packages have been developed for inferring phylogenetic trees. However, some biological phenomena, such as hybridization, recombination, and horizontal gene transfer, cannot be represented by a tree topology; phylogenetic networks are needed to represent these evolutionary mechanisms adequately. In this article, we present a new efficient heuristic algorithm for inferring hybridization networks from matrices of evolutionary distances between species. Networks are built using the Neighbor-Joining concept and the least-squares criterion. At each step of the algorithm, before joining two given nodes, we check whether a hybridization event could be related to one or both of them. The proposed algorithm finds the exact tree solution when the considered distance matrix is a tree metric (i.e., it is representable by a unique phylogenetic tree). It also provides very good hybrid recovery rates for large trees (with 32 and 64 leaves in our simulations) for both distance and sequence data. The results yielded by the new algorithm for real and simulated datasets are illustrated and discussed in detail.
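The notion of a tree metric mentioned in the abstract can be made concrete with the classical four-point condition: for every four taxa, the two largest of the three pairwise sums of distances must be equal. The sketch below is not taken from the article. The function name `is_tree_metric`, the tolerance `tol`, and the two example matrices are illustrative assumptions, and the input is assumed to already be a symmetric metric with a zero diagonal.

```python
from itertools import combinations


def is_tree_metric(d, tol=1e-9):
    """Check the four-point condition on a symmetric distance matrix.

    Hypothetical helper, not the authors' code. Only quadruples of
    distinct taxa are tested, so `d` is assumed to be a valid metric.
    """
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        # The three ways of splitting four taxa into two pairs.
        sums = sorted((
            d[i][j] + d[k][l],
            d[i][k] + d[j][l],
            d[i][l] + d[j][k],
        ))
        # In a tree metric the two largest sums coincide.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Distances induced by the tree ((A,B),(C,D)) with pendant edge
# lengths 1, 2, 3, 4 and an internal edge of length 5 (made-up values).
tree_d = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]

# Same matrix with the A-D distance changed, which breaks additivity.
non_tree_d = [
    [0, 3, 9, 12],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [12, 11, 7, 0],
]

print(is_tree_metric(tree_d))
print(is_tree_metric(non_tree_d))
```

A matrix that fails this test cannot be represented exactly by any single tree, which is the situation in which a network with hybridization events may offer a better fit.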