

Maximum likelihood molecular clock comb: analytic solutions.

Authors

Chor Benny, Khetan Amit, Snir Sagi

Affiliation

School of Computer Science, Tel-Aviv University, Tel-Aviv 39040, Israel.

Publication

J Comput Biol. 2006 Apr;13(3):819-37. doi: 10.1089/cmb.2006.13.819.

Abstract

Maximum likelihood (ML) is increasingly used as an optimality criterion for selecting evolutionary trees, but finding the global optimum is a hard computational task. Because no general analytic solution is known, numeric techniques such as hill climbing or expectation maximization (EM) are used to find optimal parameters for a given tree. So far, analytic solutions have been derived only for the simplest model: three taxa, two-state characters, under a molecular clock. Rooted four-taxon trees have two topologies: the fork (two subtrees with two leaves each) and the comb (one subtree with three leaves, the other with a single leaf). In a previous work, we devised a closed-form analytic solution for the ML molecular clock fork. In this work, we extend the state of the art in analytic solutions for ML trees to the family of all four-taxon trees under the molecular clock assumption. The change from the fork topology to the comb incurs a major increase in the complexity of the underlying algebraic system and requires novel techniques and approaches. We combine the ultrametric properties of molecular clock trees with the Hadamard conjugation to derive a number of topology-dependent identities. Employing these identities, we substantially simplify the system of polynomial equations. Finally, we use tools from algebraic geometry (e.g., Gröbner bases, ideal saturation, resultants) and employ symbolic algebra software to obtain analytic solutions for the comb. We show that, in contrast to the fork, the comb has no closed-form solutions (expressed by radicals in the input data). In general, four-taxon trees can have multiple ML points. In contrast, we can now prove that under the molecular clock assumption, the comb has a unique (local and global) ML point. (Such uniqueness was previously shown for the fork.)
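As a rough illustration of the Hadamard conjugation mentioned in the abstract, the sketch below computes the expected site-pattern probabilities of a four-taxon molecular clock comb. It is not the authors' derivation. It assumes the symmetric two-state model with edge lengths measured in expected substitutions, the comb (((1,2),3),4) with taxon 4 as the reference, and the standard form s = H^-1 exp(Hq). The function names and the node ages t1, t2, t3 are hypothetical.

```python
import math

def hadamard(n_bits):
    """Sylvester Hadamard matrix indexed by bitmask subsets."""
    size = 1 << n_bits
    return [[(-1) ** bin(a & b).count("1") for b in range(size)]
            for a in range(size)]

def comb_pattern_probs(t1, t2, t3):
    """Site-pattern probabilities for the clock comb (((1,2),3),4).

    t1 <= t2 <= t3 are the ages of the (1,2) node, the (1,2,3) node
    and the root (illustrative parameters, not taken from the paper).
    Index m of the result is a bitmask over taxa 1..3: bit i-1 is set
    when taxon i differs in state from reference taxon 4.
    """
    if not 0 < t1 <= t2 <= t3:
        raise ValueError("node ages must satisfy 0 < t1 <= t2 <= t3")
    q = [0.0] * 8
    q[0b001] = t1            # pendant edge to taxon 1
    q[0b010] = t1            # pendant edge to taxon 2 (same length: clock)
    q[0b011] = t2 - t1       # internal edge separating {1,2} from {3,4}
    q[0b100] = t2            # pendant edge to taxon 3
    q[0b111] = 2 * t3 - t2   # unrooted pendant edge to taxon 4 (passes the root)
    q[0] = -sum(q)           # empty-set entry: minus the total tree length
    H = hadamard(3)
    # rho = exp(H q), applied componentwise
    rho = [math.exp(sum(H[a][b] * q[b] for b in range(8))) for a in range(8)]
    # s = H^-1 rho, where H^-1 = H / 8 for the 8 x 8 Sylvester matrix
    return [sum(H[a][b] * rho[b] for b in range(8)) / 8 for a in range(8)]

s = comb_pattern_probs(0.1, 0.25, 0.4)
for mask, prob in enumerate(s):
    print(format(mask, "03b"), round(prob, 6))
```

Under the clock, taxa 1 and 2 are interchangeable, so the patterns indexed by 001 and 010 receive equal probability. This is a simple example of the kind of symmetry that ultrametricity imposes on the expected spectrum.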

