Hueber Stefanie D, Frickey Tancred
Department of Biology, University of Konstanz, Konstanz 78464, Germany.
J Dev Biol. 2016 Feb 4;4(1):8. doi: 10.3390/jdb4010008.
Phylogenetic methods are key to providing models for how a given protein family evolved. However, these methods run into difficulties when sequence divergence is either too low or too high. Here, we present a case study of Hox and ParaHox proteins to show how a new computational approach can provide additional insights and help solve old classification problems. For two (Gsx and Cdx) of the three ParaHox proteins, the assignments differ between the currently most established view and four alternative scenarios. We use a non-phylogenetic method based on pairwise sequence similarity to assess which of the previous predictions, if any, is best supported by the sequence-similarity relationships between Hox and ParaHox proteins. The overall sequence similarities show Gsx to be most similar to Hox2-3, and Cdx to be most similar to Hox4-8. The results indicate that an approach based purely on pairwise sequence similarity can provide additional information not only when phylogenetic inference methods have insufficient information to provide reliable classifications (as was shown previously for central Hox proteins), but also when the sequence variation is so high that the resulting phylogenetic reconstructions are likely plagued by long-branch-attraction artifacts.
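The general idea of assigning a protein to the group it resembles most by pairwise similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring, so fraction identity over pre-aligned, equal-length sequences is used here as a stand-in similarity measure. The function names, the group labels, and the toy sequences are all hypothetical and are not real Hox or ParaHox sequences.

```python
from statistics import mean


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned sequences.

    Assumption: a stand-in similarity score, not the scoring used in the paper.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_similarity_to_groups(query: str, groups: dict[str, list[str]]) -> dict[str, float]:
    """Average pairwise similarity of the query to every member of each group."""
    return {
        name: mean(identity(query, seq) for seq in seqs)
        for name, seqs in groups.items()
    }


def closest_group(query: str, groups: dict[str, list[str]]) -> str:
    """Name of the group with the highest mean pairwise similarity to the query."""
    scores = mean_similarity_to_groups(query, groups)
    return max(scores, key=scores.get)


# Hypothetical toy data: two made-up groups and one unclassified query.
groups = {
    "group_A": ["ACDEFGHIKL", "ACDEFGHIKM"],
    "group_B": ["MNPQRSTVWY", "MNPQRSTVWA"],
}
query = "ACDEFGHIKV"

scores = mean_similarity_to_groups(query, groups)
best = closest_group(query, groups)
print(scores, best)
```

No tree is inferred at any point: the assignment rests only on pairwise scores, which is why an approach of this kind avoids tree-specific artifacts such as long-branch attraction. A realistic analysis would replace `identity` with a statistically grounded similarity score over full-length sequences.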