Su Zhuo, Townsend Jeffrey P
Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, 06520, USA.
Department of Biostatistics, Yale University, New Haven, CT, 06520, USA.
BMC Evol Biol. 2015 May 14;15:86. doi: 10.1186/s12862-015-0364-7.
The detection and avoidance of "long-branch effects" is a longstanding challenge in molecular phylogenetic inference. A consequence of parallelism and convergence, long-branch effects arise when molecular divergence is unequal among lineages. They can positively mislead inference, especially when it is based on parsimony, but also when it is based on maximum likelihood or Bayesian approaches. Long-branch effects have been examined exhaustively in simulation studies that compare the performance of different inference methods on specific model trees and regions of branch-length space.
In this paper, by generalizing phylogenetic signal and noise analysis to quartets with uneven subtending branches, we quantify the utility of molecular characters for resolving quartet phylogenies via parsimony. Our quantification incorporates contributions toward the correct tree from either signal or homoplasy (i.e., "the right result for either the right reason or the wrong reason"). We also characterize a highly conservative lower bound on utility that counts contributions toward the correct tree only when they correspond to true, unobscured parsimony-informative sites (i.e., "the right result for the right reason"). We apply the generalized signal and noise analysis to classic quartet phylogenies in which long-branch effects can arise from unequal rates of evolution or an asymmetrical topology. The analysis identifies branch-length conditions under which inference will be inconsistent and indicates how sampling of molecular loci and taxa can be improved to correctly resolve phylogenies in which long-branch effects are hypothesized to exist.
The generalized signal and noise analysis provides analytical predictions of the utility of characters evolving at diverse rates for resolving quartet phylogenies with unequal branch lengths. The analysis can be used to identify characters evolving at rates appropriate for resolving phylogenies in which long-branch effects are hypothesized to occur.
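As a rough illustration of what "branch length conditions in which inference will be inconsistent" means for a quartet, the sketch below enumerates site-pattern probabilities on the true tree ((A,B),(C,D)) and checks whether the pattern supporting the true tree is the most probable of the three parsimony-informative patterns. It is not the signal and noise analysis of this paper: it assumes a symmetric two-state substitution model with branch lengths in expected substitutions per site, and the function names (change_prob, pattern_probs, parsimony_consistent) are hypothetical names chosen for this example.

```python
from itertools import product
from math import exp


def change_prob(t):
    """Probability that the two ends of a branch differ.

    Assumes a symmetric two-state model; t is the branch length in
    expected substitutions per site.
    """
    return 0.5 * (1.0 - exp(-2.0 * t))


def pattern_probs(a, b, c, d, internal):
    """Probabilities of the three parsimony-informative site patterns
    when the true tree is ((A,B),(C,D)).

    a, b, c, d are the terminal branch lengths leading to A, B, C, D;
    internal is the length of the branch joining the two cherries.
    """
    p = {"a": change_prob(a), "b": change_prob(b), "c": change_prob(c),
         "d": change_prob(d), "i": change_prob(internal)}

    def edge(x, y, key):
        # Probability of observing states x and y at the ends of a branch.
        return p[key] if x != y else 1.0 - p[key]

    totals = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    # u and v are the unobserved states at the two internal nodes.
    for u, v, sA, sB, sC, sD in product((0, 1), repeat=6):
        prob = (0.5 * edge(u, v, "i")
                * edge(u, sA, "a") * edge(u, sB, "b")
                * edge(v, sC, "c") * edge(v, sD, "d"))
        if sA == sB and sC == sD and sA != sC:
            totals["AB|CD"] += prob      # supports the true tree
        elif sA == sC and sB == sD and sA != sB:
            totals["AC|BD"] += prob      # supports a wrong tree
        elif sA == sD and sB == sC and sA != sB:
            totals["AD|BC"] += prob      # supports the other wrong tree
    return totals


def parsimony_consistent(a, b, c, d, internal):
    """True if the pattern supporting the true tree is the most probable
    informative pattern, so parsimony converges on the true tree as the
    number of sites grows (under the assumed model)."""
    probs = pattern_probs(a, b, c, d, internal)
    return probs["AB|CD"] > max(probs["AC|BD"], probs["AD|BC"])


if __name__ == "__main__":
    # Equal, short branches: parsimony is expected to be consistent.
    print(parsimony_consistent(0.1, 0.1, 0.1, 0.1, 0.1))
    # Two long non-sister branches (A and C) with a short internal branch.
    print(parsimony_consistent(1.5, 0.05, 1.5, 0.05, 0.05))
```

With two long non-sister branches and a short internal branch, parallel changes on the long branches make the pattern grouping A with C more probable than the pattern supporting the true tree. This is the classic setting in which parsimony is positively misled.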