Adam Zaky, Turmel Monique, Lemieux Claude, Sankoff David
School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada.
J Comput Biol. 2007 May;14(4):436-45. doi: 10.1089/cmb.2007.A005.
The common intervals of two permutations on n elements are the subsets of terms contiguous in both permutations. They constitute the most basic representation of conserved local order. We use d, the size of the symmetric difference (the complement of the common intervals) of the two subsets of $2^{\{1,\ldots,n\}}$ thus determined by two permutations, as an evolutionary distance between the gene orders represented by the permutations. We consider the Steiner Tree problem in the space $(2^{\{1,\ldots,n\}}, d)$ as the basis for constructing phylogenetic trees, including ancestral gene orders. We extend this to genomes with unequal gene content and to genomes containing gene families. Applied to streptophyte phylogeny, our method does not support the positioning of the complex algae Charales as a sister group to the land plants. Simulations show that the method, though unmotivated by any specific model of genome rearrangement, accurately reconstructs a tree from artificial genome data generated by random inversions deriving each genome from its ancestor on this tree.
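The distance d defined in the abstract can be illustrated with a small brute-force sketch. It assumes unsigned permutations over the same gene set and enumerates every interval explicitly, which takes roughly cubic time and is meant only to make the definition concrete; it is not the authors' algorithm and does not cover the extensions to unequal gene content or gene families. The function names `intervals`, `common_intervals`, and `distance` are hypothetical.

```python
def intervals(perm):
    """Return every set of terms that is contiguous in perm.

    Each interval is a frozenset, so the result is a subset of the
    power set of the elements of perm.
    """
    n = len(perm)
    return {frozenset(perm[i:j]) for i in range(n) for j in range(i + 1, n + 1)}


def common_intervals(p, q):
    """Sets of terms that are contiguous in both permutations."""
    return intervals(p) & intervals(q)


def distance(p, q):
    """Size of the symmetric difference of the two interval sets."""
    return len(intervals(p) ^ intervals(q))


if __name__ == "__main__":
    p = [1, 2, 3, 4]
    q = [2, 4, 1, 3]
    print(sorted(sorted(s) for s in common_intervals(p, q)))
    print(distance(p, q))
```

Singletons and the full gene set are intervals of every permutation, so they never contribute to d. Reversing a permutation leaves its interval set unchanged, so under this sketch a permutation and its reversal are at distance 0.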