A comparative study of SVDquartets and other coalescent-based species tree estimation methods.

Author information

Chou Jed, Gupta Ashu, Yaduvanshi Shashank, Davidson Ruth, Nute Mike, Mirarab Siavash, Warnow Tandy

Publication information

BMC Genomics. 2015;16 Suppl 10:S2. doi: 10.1186/1471-2164-16-S10-S2. Epub 2015 Oct 2.

Abstract

BACKGROUND

Species tree estimation is challenging in the presence of incomplete lineage sorting (ILS), which can make gene trees differ from the species tree. Because ILS is expected to occur and the standard concatenation approach can return incorrect trees with high support in the presence of ILS, "coalescent-based" summary methods (which first estimate gene trees and then combine them into a species tree) have been developed that have theoretical guarantees of robustness to arbitrarily high amounts of ILS. Some studies have suggested that summary methods should only be used on "c-genes" (i.e., recombination-free loci), which can be extremely short (sometimes fewer than 100 sites). However, gene trees estimated on short alignments can have high estimation error, and summary methods tend to have high error on short c-genes. To address this problem, Chifman and Kubatko introduced SVDquartets, a new coalescent-based method. SVDquartets takes multi-locus unlinked single-site data, infers the quartet trees for all subsets of four species, and then combines the set of quartet trees into a species tree using a quartet amalgamation heuristic. Yet the accuracy of SVDquartets relative to leading coalescent-based methods has not been assessed.
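
The quartet-inference step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, based on the general description of SVDquartets rather than on this abstract, that each of the three possible splits of four taxa is scored by how far its 16 x 16 site-pattern "flattening" matrix lies from the nearest rank-10 matrix, and that the lowest-scoring split is kept. The function names (flattening, svd_score, best_quartet) and the toy sequences are hypothetical, and the quartet amalgamation step is not shown.

```python
from itertools import product

import numpy as np

BASES = "ACGT"
INDEX = {b: i for i, b in enumerate(BASES)}


def flattening(seqs, split):
    """Return the 16x16 matrix of site-pattern frequencies for ((a, b), (c, d)).

    Rows index the joint state of taxa (a, b); columns index (c, d).
    """
    (a, b), (c, d) = split
    F = np.zeros((16, 16))
    for xa, xb, xc, xd in zip(seqs[a], seqs[b], seqs[c], seqs[d]):
        # Skip sites with gaps or ambiguity codes.
        if all(x in INDEX for x in (xa, xb, xc, xd)):
            F[4 * INDEX[xa] + INDEX[xb], 4 * INDEX[xc] + INDEX[xd]] += 1
    total = F.sum()
    return F / total if total else F


def svd_score(F, rank=10):
    """Frobenius distance from F to the nearest matrix of the given rank.

    The rank-10 threshold is an assumption taken from the published
    description of the method, not from this abstract.
    """
    singular_values = np.linalg.svd(F, compute_uv=False)
    return float(np.sqrt(np.sum(singular_values[rank:] ** 2)))


def best_quartet(seqs, taxa):
    """Score the three splits of four taxa and return the lowest-scoring one."""
    a, b, c, d = taxa
    splits = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    scores = {s: svd_score(flattening(seqs, s)) for s in splits}
    return min(scores, key=scores.get), scores


# Toy data (hypothetical): taxa a and b are identical, as are c and d,
# so the ab|cd flattening has rank 1 while the other two have rank 16.
pairs = list(product(BASES, repeat=2))
left = "".join(x for x, _ in pairs)
right = "".join(y for _, y in pairs)
toy = {"a": left, "b": left, "c": right, "d": right}

best, scores = best_quartet(toy, ("a", "b", "c", "d"))
print(best, scores[best])
```

On real data the three scores are rarely this well separated, especially with few sites, which is one reason the choice of amalgamation heuristic matters for the final species tree.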

RESULTS

We compared SVDquartets to two leading coalescent-based methods (ASTRAL-2 and NJst), and to concatenation using maximum likelihood. We used a collection of simulated datasets that varied the ILS level, the number of taxa, and the number of sites per locus. Although SVDquartets was sometimes more accurate than ASTRAL-2 and NJst, most often the best results were obtained using ASTRAL-2, even on the shortest gene sequence alignments we explored (with only 10 sites per locus). Finally, concatenation was the most accurate of all methods under low ILS conditions.

CONCLUSIONS

ASTRAL-2 generally had the best accuracy under higher ILS conditions, and concatenation had the best accuracy under the lowest ILS conditions. However, SVDquartets was competitive with the best methods under conditions with low ILS and small numbers of sites per locus. The good performance of ASTRAL-2 relative to SVDquartets under many conditions is surprising given the known vulnerability of ASTRAL-2 and similar methods to short gene sequences.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e58/4602346/f4abd63266bf/1471-2164-16-S10-S2-1.jpg
