Zhu Jiafan, Yu Yun, Nakhleh Luay
Department of Computer Science, Rice University, Houston, 77005, Texas, USA.
Department of BioSciences, Rice University, Houston, 77005, Texas, USA.
BMC Bioinformatics. 2016 Nov 11;17(Suppl 14):415. doi: 10.1186/s12859-016-1269-1.
Phylogenetic networks model reticulate evolutionary histories. The last two decades have seen an increased interest in establishing mathematical results and developing computational methods for inferring and analyzing these networks. A salient concept underlying a great majority of these developments has been the notion that a network displays a set of trees and those trees can be used to infer, analyze, and study the network.
In this paper, we show that in the presence of coalescence effects, the set of displayed trees is not sufficient to capture the network. We formally define the set of parental trees of a network and make three contributions based on this definition. First, we extend the notion of anomaly zone to phylogenetic networks and report on anomaly results for different networks. Second, we demonstrate how coalescence events could negatively affect the ability to infer a species tree that could be augmented into the correct network. Third, we demonstrate how a phylogenetic network can be viewed as a mixture model that lends itself to a novel inference approach via gene tree clustering.
Our results demonstrate the limitations of focusing on the set of trees displayed by a network when analyzing and inferring the network. Our findings can form the basis for achieving higher accuracy when inferring phylogenetic networks and open up new avenues for research in this area, including new problem formulations based on the notion of a network's parental trees.