University of East Anglia, Norwich, UK.
Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany.
Bull Math Biol. 2022 Sep 15;84(10):119. doi: 10.1007/s11538-022-01081-9.
In evolutionary studies, phylogenetic trees are commonly used to represent the evolutionary history of a set of species. However, if genes or other genetic information have been transferred between the species or their ancestors in the past, a tree may not provide a complete picture of their history. In such cases, tree-based phylogenetic networks can provide a useful, more refined representation of the species' evolution. Such a network is essentially a phylogenetic tree with some arcs added between the tree's edges to represent reticulate events such as gene transfer, hybridization and recombination. Even so, this model does not permit the direct representation of evolutionary scenarios where reticulate events have taken place between different subfamilies or lineages of species. To represent such scenarios, in this paper we introduce the notion of a forest-based network, that is, a collection of leaf-disjoint phylogenetic trees on a set of species with arcs added between the edges of distinct trees within the collection. Forest-based networks include the recently introduced class of overlaid species forests, which can be used to model introgression. As we shall see, even though the definition of forest-based networks is closely related to that of tree-based networks, they lead to new mathematical theory that complements that of tree-based networks. As well as studying the relationship of forest-based networks with other classes of phylogenetic networks, such as tree-child networks and universal tree-based networks, we present characterizations of some special classes of forest-based networks. We expect that our results will be useful for developing new models and algorithms to understand reticulate evolution, such as introgression and gene transfer between species.
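The informal definition above (leaf-disjoint trees, with extra arcs running between edges of distinct trees) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the paper's formal definition: each tree is assumed to be encoded as a set of (parent, child) edges, each added arc as a pair (source edge, target edge), and the names `leaves`, `tree_index_of` and `looks_forest_based` are hypothetical. The sketch checks only the two conditions stated in the abstract and does not test other properties a formal definition may require, such as acyclicity of the resulting network.

```python
from itertools import combinations


def leaves(tree_edges):
    """Return the leaves of a rooted tree given as a set of (parent, child) edges."""
    parents = {p for p, _ in tree_edges}
    children = {c for _, c in tree_edges}
    return children - parents


def tree_index_of(forest, edge):
    """Return the index of the tree containing `edge`, or None if no tree does."""
    for i, tree_edges in enumerate(forest):
        if edge in tree_edges:
            return i
    return None


def looks_forest_based(forest, arcs):
    """Check the two conditions from the informal definition (assumed encoding):
    1. the trees are pairwise leaf-disjoint;
    2. every added arc joins edges that lie in two distinct trees.
    """
    for a, b in combinations(forest, 2):
        if leaves(a) & leaves(b):
            return False
    for source_edge, target_edge in arcs:
        i = tree_index_of(forest, source_edge)
        j = tree_index_of(forest, target_edge)
        if i is None or j is None or i == j:
            return False
    return True


# Two small trees on the species {a, b} and {c, d} (illustrative data only).
forest = [
    {("r1", "a"), ("r1", "b")},
    {("r2", "c"), ("r2", "d")},
]

# An arc from the edge (r1, b) in the first tree to the edge (r2, c) in the second.
between_trees = [(("r1", "b"), ("r2", "c"))]

# An arc between two edges of the same tree, which the definition excludes.
within_tree = [(("r1", "a"), ("r1", "b"))]

print(looks_forest_based(forest, between_trees))
print(looks_forest_based(forest, within_tree))
```

Here the first call accepts the arc because its endpoints lie on edges of different trees, while the second call rejects an arc whose endpoints both lie in the first tree.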