Gavryushkin Alex, Whidden Chris, Matsen Frederick A
Department of Biosystems Science and Engineering, ETH Zürich, 4058, Basel, Switzerland.
Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA.
J Math Biol. 2018 Apr;76(5):1101-1121. doi: 10.1007/s00285-017-1167-9. Epub 2017 Jul 29.
A time-tree is a rooted phylogenetic tree such that all internal nodes are equipped with absolute divergence dates and all leaf nodes are equipped with sampling dates. Such time-trees have become a central object of study in phylogenetics but little is known about the parameter space of such objects. Here we introduce and study a hierarchy of discrete approximations of the space of time-trees from the graph-theoretic and algorithmic point of view. One of the basic and widely used phylogenetic graphs, the [Formula: see text] graph, is the roughest approximation and bottom level of our hierarchy. More refined approximations discretize the relative timing of evolutionary divergence and sampling dates. We study basic graph-theoretic questions for these graphs, including the size of neighborhoods, diameter upper and lower bounds, and the problem of finding shortest paths. We settle many of these questions by extending the concept of graph grammars introduced by Sleator, Tarjan, and Thurston to our graphs. Although time values greatly increase the number of possible trees, we show that 1-neighborhood sizes remain linear, allowing for efficient local exploration and construction of these graphs. We also obtain upper bounds on the r-neighborhood sizes of these graphs, including a smaller bound than was previously known for [Formula: see text]. Our results open up a number of possible directions for theoretical investigation of graph-theoretic and algorithmic properties of the time-tree graphs. We discuss the directions that are most valuable for phylogenetic applications and give a list of prominent open problems for those applications. In particular, we conjecture that the split theorem applies to shortest paths in time-tree graphs, a property not shared in the general [Formula: see text] case.
时间树是一种有根的系统发育树,其中所有内部节点都配备了绝对分歧日期,所有叶节点都配备了采样日期。这种时间树已成为系统发育学研究的核心对象,但对于此类对象的参数空间却知之甚少。在这里,我们从图论和算法的角度介绍并研究时间树空间的离散近似层次结构。一种基本且广泛使用的系统发育图,即[公式:见正文]图,是我们层次结构中最粗略的近似和底层。更精细的近似会离散化进化分歧和采样日期的相对时间。我们研究这些图的基本图论问题,包括邻域大小、直径上下界以及寻找最短路径的问题。我们通过将Sleator、Tarjan和Thurston引入的图语法概念扩展到我们的图来解决其中许多问题。尽管时间值极大地增加了可能的树的数量,但我们表明1 - 邻域大小仍然是线性的,这使得对这些图进行高效的局部探索和构建成为可能。我们还获得了这些图的r - 邻域大小的上界,包括一个比之前已知的[公式:见正文]更小的界。我们的结果为时间树图的图论和算法性质的理论研究开辟了许多可能的方向。我们讨论了对系统发育应用最有价值的方向,并给出了这些应用中突出的开放问题列表。特别是,我们推测分裂定理适用于时间树图中的最短路径,这是一般[公式:见正文]情况所不具备的性质。