Chor Benny, Hendy Michael D, Snir Sagi
School of Computer Science, Tel-Aviv University, Israel.
Mol Biol Evol. 2006 Mar;23(3):626-32. doi: 10.1093/molbev/msj069. Epub 2005 Nov 30.
Maximum likelihood (ML) is a popular method for inferring a phylogenetic tree describing the evolutionary relationships among a set of taxa from their observed, aligned homologous genetic sequences. Generally, the computation of the ML tree is based on numerical methods, which, in a few cases, are known to converge to a suboptimal local maximum on a tree. The extent of this problem is unknown. One approach to addressing it is to derive algebraic equations from the likelihood function and find the maximum points analytically. This approach has so far been successful only in the very simplest cases: three or four taxa under the Neyman model of evolution of two-state characters. In this paper, we extend this approach, for the first time, to four-state characters: the Jukes-Cantor model under a molecular clock, on a tree T on three taxa (a rooted triple). We employ spectral methods (Hadamard conjugation) to express the likelihood function parameterized by the path-length spectrum. Taking partial derivatives, we derive a set of polynomial equations whose simultaneous solution contains all critical points of the likelihood function. Using tools of algebraic geometry (the resultant of two polynomials) in a computer algebra package (Maple), we are able to find all turning points analytically. We then apply this method to real sequence data and obtain realistic results on the primate-rodent divergence time.
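The likelihood function for a rooted triple under the Jukes-Cantor model with a molecular clock can be illustrated numerically. The following is a minimal sketch, not the authors' Hadamard-conjugation derivation: it sums over the unobserved ancestral states by brute force. It assumes branch lengths are measured in expected substitutions per site, a tree ((a,b),c) with the a-b ancestor at height `t1` and the root at height `t2`, and a uniform root distribution. All function names and the five site-pattern labels (`xxx`, `xxy`, `xyx`, `yxx`, `xyz`) are hypothetical choices made for this illustration.

```python
import math
from itertools import product

STATES = "ACGT"
CLASSES = ("xxx", "xxy", "xyx", "yxx", "xyz")


def jc_prob(x, y, t):
    """Jukes-Cantor probability that state x becomes state y along a branch
    of length t (assumed unit: expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e


def pattern_prob(a, b, c, t1, t2):
    """Probability of leaf states (a, b, c) on the clock tree ((a,b),c).

    t1 is the height of the a-b ancestor, t2 >= t1 is the root height.
    The sum runs over the root state r and the a-b ancestor state s.
    """
    total = 0.0
    for r in STATES:
        for s in STATES:
            total += (0.25                        # uniform root distribution
                      * jc_prob(r, s, t2 - t1)    # root -> a-b ancestor
                      * jc_prob(s, a, t1)         # ancestor -> a
                      * jc_prob(s, b, t1)         # ancestor -> b
                      * jc_prob(r, c, t2))        # root -> c
    return total


def pattern_class(a, b, c):
    """Group a site pattern by which leaves share a state."""
    if a == b == c:
        return "xxx"
    if a == b:
        return "xxy"
    if a == c:
        return "xyx"
    if b == c:
        return "yxx"
    return "xyz"


def class_probs(t1, t2):
    """Total probability of each of the five pattern classes."""
    probs = dict.fromkeys(CLASSES, 0.0)
    for a, b, c in product(STATES, repeat=3):
        probs[pattern_class(a, b, c)] += pattern_prob(a, b, c, t1, t2)
    return probs


def log_likelihood(counts, t1, t2):
    """Log-likelihood of observed class counts, up to an additive constant
    that does not depend on t1 or t2."""
    probs = class_probs(t1, t2)
    return sum(n * math.log(probs[k]) for k, n in counts.items() if n)
```

Under these assumptions, the classes `xyx` and `yxx` have equal probability because the clock makes leaves a and b interchangeable. A numerical optimizer applied to `log_likelihood` over `t1` and `t2` is the kind of procedure that can stall at a local maximum, which is the motivation for the analytic approach described in the abstract.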