Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA.
Bioinformatics. 2010 Jun 15;26(12):i140-8. doi: 10.1093/bioinformatics/btq198.
Reticulate network is a model for displaying and quantifying the effects of complex reticulate processes on the evolutionary history of species undergoing reticulate evolution. A central computational problem on reticulate networks is: given a set of phylogenetic trees (each for some region of the genomes), reconstruct the most parsimonious reticulate network (called the minimum reticulate network) that combines the topological information contained in the given trees. This problem is well-known to be NP-hard. Thus, existing approaches for this problem either work with only two input trees or make simplifying topological assumptions.
We present novel results on the minimum reticulate network problem. Unlike existing approaches, we address the fully general problem: there is no restriction on the number of trees that are input, and there is no restriction on the form of the allowed reticulate network. We present lower and upper bounds on the minimum number of reticulation events in the minimum reticulate network (and infer an approximately parsimonious reticulate network). A program called PIRN implements these methods, which also outputs a graphical representation of the inferred network. Empirical results on simulated and biological data show that our methods are practical for a wide range of data. More importantly, the lower and upper bounds match for many datasets (especially when the number of trees is small or reticulation level is low), and this allows us to solve the minimum reticulate network problem exactly for these datasets.
A software tool, PIRN, is available for download from the web page: http://www.engr.uconn.edu/~ywu.
Supplementary data is available at Bioinformatics online.
网状网络是一种用于显示和量化复杂网状过程对经历网状进化的物种进化历史的影响的模型。网状网络上的一个核心计算问题是:给定一组系统发育树(每个树代表基因组的某些区域),重建最简约的网状网络(称为最小网状网络),该网络组合了给定树中包含的拓扑信息。这个问题是众所周知的 NP 难问题。因此,现有的这个问题的方法要么只处理两个输入树,要么做出简化的拓扑假设。
我们提出了最小网状网络问题的新结果。与现有的方法不同,我们解决了完全通用的问题:输入的树的数量没有限制,允许的网状网络的形式也没有限制。我们提出了最小网状网络中最少的网状事件数的下界和上界(并推断出一个近似简约的网状网络)。一个名为 PIRN 的程序实现了这些方法,该程序还输出推断出的网络的图形表示。在模拟和生物数据上的实验结果表明,我们的方法适用于广泛的数据。更重要的是,对于许多数据集,下界和上界都匹配(尤其是当树的数量较少或网状程度较低时),这使我们能够为这些数据集精确地解决最小网状网络问题。
一个名为 PIRN 的软件工具可从网页下载:http://www.engr.uconn.edu/~ywu。
补充数据可在 Bioinformatics 在线获得。