Wu Yufeng
Computer Science and Engineering Department, 371 Fairfield Road, Unit 2155, University of Connecticut Storrs, CT 06269.
J Comput Biol. 2013 Oct;20(10):792-804. doi: 10.1089/cmb.2013.0072.
A phylogenetic network is a model for reticulate evolution. A hybridization network is one type of phylogenetic network for a set of discordant gene trees and "displays" each gene tree. A central computational problem on hybridization networks is: given a set of gene trees, reconstruct the minimum (i.e., most parsimonious) hybridization network that displays each given gene tree. This problem is known to be NP-hard, and existing approaches for this problem are either heuristics or making simplifying assumptions (e.g., work with only two input trees or assume some topological properties). In this article, we develop an exact algorithm (called PIRNC) for inferring the minimum hybridization networks from multiple gene trees. The PIRNC algorithm does not rely on structural assumptions (e.g., the so-called galled networks). To the best of our knowledge, PIRNC is the first exact algorithm implemented for this formulation. When the number of reticulation events is relatively small (say, four or fewer), PIRNC runs reasonably efficient even for moderately large datasets. For building more complex networks, we also develop a heuristic version of PIRNC called PIRNCH. Simulation shows that PIRNCH usually produces networks with fewer reticulation events than those by an existing method. PIRNC and PIRNCH have been implemented as part of the software package called PIRN and is available online.
系统发育网络是一种用于网状进化的模型。杂交网络是针对一组不一致基因树的一种系统发育网络类型,并且“展示”每棵基因树。杂交网络的一个核心计算问题是:给定一组基因树,重建展示每棵给定基因树的最小(即最简约)杂交网络。已知这个问题是NP难问题,并且针对此问题的现有方法要么是启发式方法,要么是做出简化假设(例如,仅处理两棵输入树或假设某些拓扑性质)。在本文中,我们开发了一种精确算法(称为PIRNC),用于从多棵基因树推断最小杂交网络。PIRNC算法不依赖于结构假设(例如,所谓的有结网络)。据我们所知,PIRNC是针对此公式实现的首个精确算法。当网状化事件的数量相对较少(比如说,四个或更少)时,即使对于中等规模的数据集,PIRNC运行起来也相当高效。为了构建更复杂的网络,我们还开发了PIRNC的一个启发式版本,称为PIRNCH。模拟表明,PIRNCH通常产生的具有网状化事件的网络比现有方法产生的网络更少。PIRNC和PIRNCH已作为名为PIRN的软件包的一部分实现,并且可在线获取。