IEEE/ACM Trans Comput Biol Bioinform. 2019 Jan-Feb;16(1):14-22. doi: 10.1109/TCBB.2018.2849732. Epub 2018 Jun 22.
Phylogenetic tree reconciliation is widely used in the fields of molecular evolution, cophylogenetics, parasitology, and biogeography to study the evolutionary histories of pairs of entities. In these contexts, reconciliation is often performed using maximum parsimony under the Duplication-Transfer-Loss (DTL) event model. In general, the number of maximum parsimony reconciliations (MPRs) can grow exponentially with the size of the trees. While a number of previous efforts have been made to count the number of MPRs, find representative MPRs, and compute the frequencies of events across the space of MPRs, little is known about the structure of MPR space. In particular, how different are MPRs in terms of the events that they comprise? One way to address this question is to compute the diameter of MPR space, defined to be the maximum number of DTL events that distinguish any two MPRs in the solution space. We show how to compute the diameter of MPR space in polynomial time and then apply this algorithm to a large biological dataset to study the variability of events.
系统发育树协调广泛应用于分子进化、共进化、寄生虫学和生物地理学领域,用于研究对实体的进化历史。在这些上下文中,通常使用复制-转移-丢失 (DTL) 事件模型在最大简约法下进行协调。通常,最大简约协调 (MPR) 的数量会随着树的大小呈指数增长。虽然之前已经有一些努力来计算 MPR 的数量、找到代表 MPR 的方法以及计算 MPR 空间中事件的频率,但对 MPR 空间的结构知之甚少。特别是,MPR 在组成它们的事件方面有何不同?解决这个问题的一种方法是计算 MPR 空间的直径,定义为区分解决方案空间中任何两个 MPR 的最大 DTL 事件数。我们展示了如何在多项式时间内计算 MPR 空间的直径,然后将此算法应用于大型生物数据集以研究事件的可变性。