Carmel Amir, Musa-Lempel Noa, Tsur Dekel, Ziv-Ukelson Michal
Department of Computer Science, Ben-Gurion University of the Negev , Beer Sheva, Israel .
J Comput Biol. 2014 Nov;21(11):799-808. doi: 10.1089/cmb.2014.0128. Epub 2014 Oct 10.
One of the core classical problems in computational biology is that of constructing the most parsimonious phylogenetic tree interpreting an input set of sequences from the genomes of evolutionarily related organisms. We reexamine the classical maximum parsimony (MP) optimization problem for the general (asymmetric) scoring matrix case, where rooted phylogenies are implied, and analyze the worst case bounds of three approaches to MP: The approach of Cavalli-Sforza and Edwards, the approach of Hendy and Penny, and a new agglomerative, "bottom-up" approach we present in this article. We show that the second and third approaches are faster than the first one by a factor of Θ(√n) and Θ(n), respectively, where n is the number of species.
计算生物学中的一个核心经典问题是构建最简约的系统发育树,以解释来自进化相关生物体基因组的一组输入序列。我们重新审视一般(不对称)评分矩阵情况下的经典最大简约(MP)优化问题,其中隐含着有根系统发育树,并分析MP的三种方法的最坏情况界限:卡瓦利 - 斯福尔扎和爱德华兹的方法、亨迪和彭尼的方法,以及我们在本文中提出的一种新的凝聚式“自下而上”方法。我们表明,第二种和第三种方法分别比第一种方法快Θ(√n)和Θ(n)倍,其中n是物种数量。