Bernard M. E. Moret, Luay Nakhleh, Tandy Warnow, C. Randal Linder, Anna Tholse, Anneke Padolina, Jerry Sun, Ruth Timme
Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA.
IEEE/ACM Trans Comput Biol Bioinform. 2004 Jan-Mar;1(1):13-23. doi: 10.1109/TCBB.2004.10.
Phylogenetic networks model the evolutionary history of sets of organisms when events such as hybrid speciation and horizontal gene transfer occur. In spite of their widely acknowledged importance in evolutionary biology, phylogenetic networks have so far been studied mostly for specific data sets. We present a general definition of phylogenetic networks in terms of directed acyclic graphs (DAGs) and a set of conditions. Further, we distinguish between model networks and reconstructible ones and characterize the effect of extinction and taxon sampling on the reconstructibility of the network. Simulation studies are a standard technique for assessing the performance of phylogenetic methods. A main step in such studies entails quantifying the topological error between the model and inferred phylogenies. While many measures of tree topological accuracy have been proposed, none exist for phylogenetic networks. Previously, we proposed the first such measure, which applied only to a restricted class of networks. In this paper, we extend that measure to apply to all networks, and prove that it is a metric on the space of phylogenetic networks. Our results allow for the systematic study of existing network methods, and for the design of new accurate ones.