Calamoneri T, Monti A, Sinaimeri B
Computer Science Department, Sapienza University of Rome, Rome, Italy.
INRIA Grenoble Rhône-Alpes, Grenoble, France.
J Math Biol. 2019 Aug;79(3):1149-1167. doi: 10.1007/s00285-019-01385-w. Epub 2019 Jun 15.
In reconstructing the common evolutionary history of hosts and parasites, the current method of choice is the phylogenetic tree reconciliation. In this model, we are given a host tree H, a parasite tree P, and a function [Formula: see text] mapping the leaves of P to the leaves of H and the goal is to find, under some biologically motivated constraints, a reconciliation, that is a function from the vertices of P to the vertices of H that respects [Formula: see text] and allows the identification of biological events such as co-speciation, duplication and host switch. The maximum co-divergence problem consists in finding the maximum number of co-speciations in a reconciliation. This problem is NP-hard for arbitrary phylogenetic trees and no approximation algorithm is known. In this paper we consider the influence of tree topology on the maximum co-divergence problem. In particular we focus on a particular tree structure, namely caterpillar, and show that-in this case-the heuristics that are mostly used in the literature provide solutions that can be arbitrarily far from the optimal value. Then, we prove that finding the max co-divergence is equivalent to compute the maximum length of a subsequence with certain properties of a given permutation. This equivalence leads to two consequences: (1) it shows that we can compute efficiently in polynomial time the optimal time-feasible reconciliation and (2) it can be used to understand how much the tree topology influences the value of the maximum number of co-speciations.
在重构宿主和寄生虫的共同进化历史时,当前首选的方法是系统发育树协调。在这个模型中,我们有一个宿主树H、一个寄生虫树P,以及一个将P的叶子映射到H的叶子的函数[公式:见原文],目标是在一些生物学动机的约束下找到一种协调,即一个从P的顶点到H的顶点的函数,该函数尊重[公式:见原文],并允许识别诸如共物种形成、复制和宿主转换等生物学事件。最大共分歧问题在于在一种协调中找到最大数量的共物种形成。对于任意系统发育树,这个问题是NP难的,并且不存在已知的近似算法。在本文中,我们考虑树拓扑结构对最大共分歧问题的影响。特别地,我们关注一种特定的树结构,即毛毛虫树,并表明在这种情况下,文献中最常用的启发式方法提供的解决方案可能与最优值相差任意远。然后,我们证明找到最大共分歧等同于计算给定排列具有某些属性的子序列的最大长度。这种等价性导致两个结果:(1)它表明我们可以在多项式时间内有效地计算最优的时间可行协调;(2)它可用于理解树拓扑结构对共物种形成最大数量值的影响程度。