Xiangtan University, Hunan, China.
Int J Mol Sci. 2010 Mar 18;11(3):1141-54. doi: 10.3390/ijms11031141.
Most alignment-free correlation distance methods that build phylogenies from the composition vectors of complete genomes share a shortcoming: their "distances" are not proper distance metrics in the strict mathematical sense. In this paper we propose two new correlation-related distance metrics to replace the old one in our dynamical language approach. We use four genome datasets to evaluate the effect of this replacement from a biological point of view. The two proper distance metrics yield trees whose topologies are the same as, or similar to, those obtained with the old "distance", and the trees agree with the 16S rRNA-based tree of life in a majority of the basic branches. Hence the two proper correlation-related distance metrics proposed here improve our dynamical language approach for phylogenetic analysis.
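To illustrate what "not a proper distance metric" can mean in practice, the following minimal sketch checks the triangle inequality on three toy vectors. The abstract does not give any formulas, so everything here is an assumption: the old "distance" is taken to have the commonly used form (1 - C)/2, where C is the cosine correlation of two composition vectors, and the angular and chord distances are two standard correlation-related metrics chosen for illustration, not necessarily the two proposed in the paper. All function names are hypothetical.

```python
import math


def cosine_correlation(u, v):
    """Cosine of the angle between two composition vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def correlation_distance(u, v):
    """Assumed form of the old 'distance': (1 - C) / 2."""
    return (1.0 - cosine_correlation(u, v)) / 2.0


def angular_distance(u, v):
    """Angle between the vectors; a proper metric on directions."""
    c = max(-1.0, min(1.0, cosine_correlation(u, v)))  # guard against rounding
    return math.acos(c)


def chord_distance(u, v):
    """Euclidean distance between the unit-normalized vectors."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - cosine_correlation(u, v))))


def violates_triangle(dist, x, y, z, tol=1e-9):
    """True if dist(x, z) exceeds dist(x, y) + dist(y, z) by more than tol."""
    return dist(x, z) > dist(x, y) + dist(y, z) + tol


if __name__ == "__main__":
    # Toy vectors (illustrative only, not genome data).
    x, y, z = (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)
    for dist in (correlation_distance, angular_distance, chord_distance):
        print(dist.__name__, "violates triangle inequality:",
              violates_triangle(dist, x, y, z))
```

On these toy vectors the assumed (1 - C)/2 form fails the triangle inequality, whereas the angular and chord distances satisfy it. A single example like this can disprove the metric property but cannot prove it for the other two.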