Department of Computer Science, The University of Texas at Austin, Austin, TX 78712, USA and Departments of Computer Science and Bioengineering, The University of Illinois at Urbana-Champaign, Champaign, IL 61801, USA.
Bioinformatics. 2015 Jun 15;31(12):i44-52. doi: 10.1093/bioinformatics/btv234.
Estimating species phylogenies requires multiple loci, because incomplete lineage sorting, which is modeled by the multi-species coalescent model, can cause different loci to have different trees. We recently developed a coalescent-based method, ASTRAL, which is statistically consistent under the multi-species coalescent model and which was more accurate than other coalescent-based methods on the datasets we examined. ASTRAL runs in polynomial time by constraining the search space to a set of allowed 'bipartitions'. Despite this restriction, ASTRAL remains statistically consistent.
We present a new version of ASTRAL, which we call ASTRAL-II. We show that ASTRAL-II has substantial advantages over ASTRAL: it is faster, can analyze much larger datasets (up to 1000 species and 1000 genes) and has substantially better accuracy under some conditions. ASTRAL's running time is [Formula: see text], and ASTRAL-II's running time is [Formula: see text], where n is the number of species, k is the number of loci and X is the set of allowed bipartitions for the search space.
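To make the notion of an allowed bipartition set X concrete, the following is a minimal sketch and not the ASTRAL implementation. It assumes, purely for illustration, that X is the set of non-trivial bipartitions observed in the input gene trees; the abstract does not state how X is built. The nested-tuple tree encoding and the names `leaf_set`, `bipartitions` and `allowed_bipartitions` are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]
Bipartition = FrozenSet[FrozenSet[str]]


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names at the leaves of a tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree: Tree) -> Set[Bipartition]:
    """Collect the non-trivial bipartitions induced by the edges of one tree."""
    taxa = leaf_set(tree)
    found: Set[Bipartition] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        cluster = frozenset().union(*(visit(child) for child in node))
        other = taxa - cluster
        # Keep only splits with at least two taxa on each side.
        if len(cluster) > 1 and len(other) > 1:
            found.add(frozenset([cluster, other]))
        return cluster

    visit(tree)
    return found


def allowed_bipartitions(gene_trees: Iterable[Tree]) -> Set[Bipartition]:
    """Assumed construction of X: the union of bipartitions over all gene trees."""
    allowed: Set[Bipartition] = set()
    for tree in gene_trees:
        allowed |= bipartitions(tree)
    return allowed


# Three toy gene trees on five species; two of them share a topology.
gene_trees = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
    ((("A", "B"), "C"), ("D", "E")),
]
X = allowed_bipartitions(gene_trees)
```

Under this assumption, |X| grows with the number of distinct splits seen across the k gene trees rather than with the total number of possible bipartitions of n species, which is why restricting the search to X can keep the running time polynomial.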
ASTRAL-II is available as open-source software at https://github.com/smirarab/ASTRAL, and the datasets used are available at http://www.cs.utexas.edu/~phylo/datasets/astral2/.
Supplementary data are available at Bioinformatics online.