Steel M A, Fu Y X
Department of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand.
J Comput Biol. 1995 Spring;2(1):39-47. doi: 10.1089/cmb.1995.2.39.
Linear invariants are useful tools for testing phylogenetic hypotheses from aligned DNA/RNA sequences, particularly when the sites evolve at different rates. Here we give a simple, graph-theoretic classification, for each phylogenetic tree T, of its associated vector space I(T) of linear invariants under the Jukes-Cantor one-parameter model of nucleotide substitution. We also provide an easily described basis for I(T), and show that if T is a binary (fully resolved) phylogenetic tree with n sequences at its leaves, then dim[I(T)] = 4^n - F_{2n-2}, where F_n is the nth Fibonacci number. Our method applies a recently developed Hadamard matrix-based technique to describe elements of I(T) in terms of edge-disjoint packings of subtrees in T, and thereby complements earlier, more algebraic treatments.
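As a rough illustration of the dimension formula, the following Python sketch evaluates 4^n - F_{2n-2} for small n. The abstract does not say how the Fibonacci numbers are indexed, so the convention F_1 = 1, F_2 = 2 used here is an assumption. It was chosen because, for n = 2, the 16 site patterns fall into two Jukes-Cantor classes (identical or different bases), which would leave 16 - 2 = 14 independent homogeneous linear invariants. The function names `fib` and `jc_linear_invariant_dim` are hypothetical and do not come from the paper.

```python
def fib(k: int) -> int:
    """Fibonacci number under the ASSUMED convention F_0 = 1, F_1 = 1, F_2 = 2, ..."""
    if k < 0:
        raise ValueError("k must be non-negative")
    a, b = 1, 1  # F_0, F_1 under the assumed indexing
    for _ in range(k):
        a, b = b, a + b
    return a


def jc_linear_invariant_dim(n: int) -> int:
    """Evaluate 4^n - F_{2n-2} for a binary tree with n leaves (n >= 2)."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return 4 ** n - fib(2 * n - 2)


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, jc_linear_invariant_dim(n))
```

Under this assumed indexing the sketch gives 14, 59, and 243 for n = 2, 3, and 4. A different Fibonacci convention would shift these values, so they should be checked against the paper's own definition of F_n.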