BMC Bioinformatics. 2013;14 Suppl 15(Suppl 15):S6. doi: 10.1186/1471-2105-14-S15-S6. Epub 2013 Oct 15.
Phylogenomic analyses involving whole-genome or multi-locus data often entail dealing with incongruent gene trees. In this paper, we address two causes of such incongruence, incomplete lineage sorting (ILS) and hybridization, under both parsimony and probabilistic criteria.
Under the assumption of ILS, computing the probability of a gene tree given a species tree is a computationally very hard problem. We present a heuristic that speeds up this computation, and demonstrate that it scales to data sizes that cannot feasibly be analyzed with current techniques, while achieving very good accuracy. Further, under the assumption of both ILS and hybridization, computing the probability of a gene tree and parsimoniously reconciling it with a phylogenetic network are both very hard problems. We present two exact algorithms, one for each of these problems, that speed up existing techniques significantly and enable analyses of much larger data sets than is currently feasible.
Our heuristics and algorithms enable phylogenomic analyses of larger (in terms of numbers of taxa) data sets than is currently feasible. Further, our methods account for ILS and hybridization, thus allowing analyses of reticulate evolutionary histories.
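To make the quantity "probability of a gene tree given a species tree" concrete, the following is a minimal sketch of the smallest nontrivial case only: a rooted three-taxon species tree ((A,B),C) with one sampled lineage per species. It uses the standard multispecies coalescent result for that case and is not the heuristic or the exact algorithms presented in the paper. The function name, the taxon labels, and the parameter `t` (internal branch length in coalescent units) are assumptions introduced for illustration.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities under ILS for species tree ((A,B),C).

    t is the length of the internal branch (the ancestor of A and B) in
    coalescent units. Hypothetical helper for illustration only.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")

    # Probability that the A and B lineages fail to coalesce on the
    # internal branch, so all three lineages enter the root population.
    p_no_coalescence = math.exp(-t)

    # In the root population each of the three topologies is equally likely,
    # so each discordant topology gets one third of that probability mass.
    discordant = p_no_coalescence / 3.0

    # The concordant topology takes the remaining probability.
    concordant = 1.0 - 2.0 * discordant

    return {
        "((A,B),C)": concordant,
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }
```

A short internal branch (small `t`) pushes the three topologies toward equal probability, which is the regime where gene tree incongruence due to ILS is most pronounced. For larger trees the number of coalescent histories to account for grows quickly, which is why faster computation matters.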