Wheeler Ward C
Division of Invertebrate Zoology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024-5192, USA.
BMC Bioinformatics. 2015 Sep 17;16:296. doi: 10.1186/s12859-015-0675-0.
Many problems in comparative biology are, or are thought to be, best expressed as phylogenetic "networks" as opposed to trees. In a tree, each vertex may have only a single parent (ancestor), while a network allows a vertex to have multiple parents. There are two main interpretive types of networks, "softwired" and "hardwired." The parsimony cost of a hardwired network is based on all changes over all edges, and hence must be greater than or equal to the cost of the best tree contained in ("displayed" by) the network. In a softwired network, by contrast, each character follows the lowest parsimony cost tree displayed by the network, resulting in a cost that is less than or equal to that of the best display tree. Neither situation is ideal: hardwired networks are not generally biologically attractive (they imply that an individual heritable character can have more than one parent), and softwired networks can be trivially optimized (by containing the best tree for each character). Furthermore, given the differing cost scenarios of trees and these two flavors of networks, hypothesis testing among these explanatory scenarios is impossible.
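The two cost definitions can be illustrated with a minimal brute-force sketch. It is not the algorithm used in the paper: the toy network, the two binary characters, and all function names (`min_changes`, `display_trees`, `hardwired_cost`, `softwired_cost`, `best_display_tree_cost`) are assumptions made for illustration, and exhaustive enumeration is only feasible for very small graphs.

```python
from itertools import product

# Hypothetical toy network (not from the paper); edges are (parent, child).
# Vertex "h" has two parents ("u" and "w"), so it is the only reticulation.
EDGES = [("r", "u"), ("r", "v"), ("u", "A"), ("u", "h"), ("v", "w"),
         ("v", "D"), ("w", "h"), ("w", "C"), ("h", "B")]

# Two hypothetical binary characters observed at the leaves A, B, C, D.
CHARACTERS = [
    {"A": 0, "B": 0, "C": 1, "D": 1},
    {"A": 0, "B": 1, "C": 1, "D": 0},
]

def min_changes(edges, leaf_states):
    """Fewest state changes over the given edges, by trying every
    assignment of states to the unlabeled (internal) vertices."""
    nodes = {n for edge in edges for n in edge}
    internal = sorted(nodes - set(leaf_states))
    alphabet = sorted(set(leaf_states.values()))
    best = None
    for combo in product(alphabet, repeat=len(internal)):
        state = dict(leaf_states)
        state.update(zip(internal, combo))
        cost = sum(state[p] != state[c] for p, c in edges)
        best = cost if best is None else min(best, cost)
    return best

def display_trees(edges):
    """Edge lists of all display trees: keep exactly one parent edge
    for every vertex that has more than one parent."""
    parents = {}
    for p, c in edges:
        parents.setdefault(c, []).append(p)
    reticulations = [c for c, ps in parents.items() if len(ps) > 1]
    trees = []
    for choice in product(*(parents[c] for c in reticulations)):
        kept = dict(zip(reticulations, choice))
        trees.append([(p, c) for p, c in edges
                      if c not in kept or kept[c] == p])
    return trees

def hardwired_cost(edges, characters):
    # Every edge of the network counts for every character.
    return sum(min_changes(edges, ch) for ch in characters)

def softwired_cost(edges, characters):
    # Each character independently follows its cheapest display tree.
    trees = display_trees(edges)
    return sum(min(min_changes(t, ch) for t in trees) for ch in characters)

def best_display_tree_cost(edges, characters):
    # All characters must follow the same single display tree.
    return min(sum(min_changes(t, ch) for ch in characters)
               for t in display_trees(edges))

if __name__ == "__main__":
    print("softwired:", softwired_cost(EDGES, CHARACTERS))
    print("best display tree:", best_display_tree_cost(EDGES, CHARACTERS))
    print("hardwired:", hardwired_cost(EDGES, CHARACTERS))
```

On this toy input the softwired cost should not exceed the best display tree cost, which in turn should not exceed the hardwired cost, mirroring the inequalities stated above. The softwired value is lower here because the two characters each favor a different display tree.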
A network cost adjustment (penalty) is proposed to allow phylogenetic trees and softwired phylogenetic networks to compete equally on a parsimony optimality basis. The adjustment is demonstrated on several real and simulated datasets. In each case, the favored graph representation (tree or network) matched the expectation or the simulation scenario.
The softwired network cost regime proposed here presents a quantitative criterion for an optimality-based search procedure where trees and networks can participate in hypothesis testing simultaneously.